Redundant Array of Inexpensive Disks

Redundant arrays of inexpensive disks (RAID) are computer mass storage techniques that use multiple physical storage volumes to improve performance and fault tolerance. There are multiple levels of RAID, from a minimal method that only improves performance up to methods that allow the system to recover gracefully from multiple hardware failures while permitting damaged parts to be replaced without interrupting storage service.

The most basic principle of RAID is striping, in which a file is spread across several volumes, virtualized to look like one, so that computer processors can read from and write to several slower disks concurrently.
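
The arithmetic behind striping can be sketched in a few lines of Python; the stripe-unit size and disk count below are arbitrary assumptions for illustration, not parameters of any particular controller.

 STRIPE_UNIT_BLOCKS = 128   # blocks per stripe unit (assumed)
 NUM_DISKS = 4              # disks in the array (assumed)

 def map_logical_block(lba):
     """Map a logical block number to (disk index, block offset on that disk)."""
     stripe_unit = lba // STRIPE_UNIT_BLOCKS    # which stripe unit, array-wide
     disk = stripe_unit % NUM_DISKS             # round-robin across the disks
     unit_on_disk = stripe_unit // NUM_DISKS    # stripe units already on that disk
     offset = unit_on_disk * STRIPE_UNIT_BLOCKS + (lba % STRIPE_UNIT_BLOCKS)
     return disk, offset

 print(map_logical_block(0))    # (0, 0)
 print(map_logical_block(128))  # (1, 0)
 print(map_logical_block(512))  # (0, 128)

Because consecutive stripe units land on different disks, a large sequential transfer keeps all members busy at once.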

Next most fundamental, and the first technique that provides fault tolerance, is mirroring. Where striping writes different information to different media, mirroring writes more than one copy of the same data to multiple media, protecting the physical-level information from individual failures. Remember that the metadata must be protected as well.
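
As a minimal sketch of the idea, the following Python treats two ordinary files as mirror replicas; the file names, sizes, and error handling are illustrative assumptions, not any real driver's behavior.

 REPLICAS = ["mirror0.img", "mirror1.img"]   # stand-ins for physical volumes

 for path in REPLICAS:                       # initialize small demo volumes
     with open(path, "wb") as vol:
         vol.write(b"\0" * 1024)

 def mirrored_write(offset, data):
     """Write the same data, at the same offset, to every replica."""
     for path in REPLICAS:
         with open(path, "r+b") as vol:
             vol.seek(offset)
             vol.write(data)

 def mirrored_read(offset, length):
     """Serve the read from the first replica that responds."""
     for path in REPLICAS:
         try:
             with open(path, "rb") as vol:
                 vol.seek(offset)
                 return vol.read(length)
         except OSError:
             continue                        # fall through to the next copy
     raise IOError("all mirror replicas failed")

 mirrored_write(100, b"same bytes everywhere")
 assert mirrored_read(100, 21) == b"same bytes everywhere"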

For another level of fault tolerance, various schemes provide an error-correcting code (sometimes simplistically called parity) that may itself be mirrored.
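
The simplest such code is plain XOR parity: the parity block is the XOR of the data blocks, so any single missing block can be rebuilt from the survivors. A minimal Python illustration, with made-up block contents:

 def xor_blocks(blocks):
     """XOR equal-sized blocks together byte by byte."""
     out = bytearray(len(blocks[0]))
     for block in blocks:
         for i, byte in enumerate(block):
             out[i] ^= byte
     return bytes(out)

 d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"      # data blocks (illustrative)
 parity = xor_blocks([d0, d1, d2])

 rebuilt = xor_blocks([d0, d2, parity])      # suppose d1 is lost
 assert rebuilt == d1                        # the survivors reconstruct it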

The term RAID, for redundant array of inexpensive disks, is an umbrella for a variety of multiple-volume techniques. RAID has been extended to cover nonredundant arrays, but does not include SLED (single large expensive disk).

History

RAID features, though not under that name, appeared in a 1978 patent,[1] which mentioned both mirroring and distributed error correction as prior art. RAID itself was first described by Patterson and colleagues in a 1987 technical report,[2] published as a 1988 paper, which introduced Levels 1 through 5. RAID 6 appears to be replacing RAID 5 as a multivendor solution. There are also a number of vendor-specific RAID variants.

Implementation

RAID is usually implemented in firmware in a disk controller, although there may be some management software on the associated computer. In desktop computers, RAID support is often on the motherboard, although it can also be in one or more of the disk enclosures.

RAID variants

The well-standardized forms of RAID have numbers such as RAID 1, which may be combined (e.g., RAID 0+1) when the set of features does not make up one of the more powerful forms (e.g., RAID 5). Various proprietary feature sets can have vendor-defined numbers, such as RAID 60.

Less than full RAID

RAID 0, at the device driver level, creates a large logical disk from an array of several smaller physical disks. RAID 0 is more for servers than workstations. The efficiency of RAID 0 in this application relates significantly to controller cache size and, of course, to the ability of the driver and controller to overlap operations.

File:RAID-striping.png
RAID 0 functional drawing

In this context, logical denotes a concept at the level of system software, which, in turn, can have multiple virtual disks mapped onto it. RAID 0 is really not part of the true RAID family. While it does involve arrays of disks, it has no redundancy. Its principle of striping, however, is common to all of the higher, fault-tolerant RAID levels. Due to the lack of redundancy, it is sometimes disparaged as “just a bunch of disks” (JBOD). RAID 0, however, does improve transfer rates for large transfers.

While it is possible to do RAID 0 with more than two physical disks, from a reliability standpoint doing so is a very bad idea. Whenever a set of elements is all essential to operation, the reliability of the whole falls as the number of elements grows, because the failure of any one element fails the set. A failure in a striped array, without any other fault-tolerance mechanism, loses the data of the entire array; the only practical way to recover is from a backup.
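
A back-of-the-envelope calculation makes the point; the 3% annual failure probability assumed below is purely illustrative:

 p_disk_fails = 0.03    # assumed annual failure probability of one disk

 # A RAID 0 array survives the year only if EVERY member survives,
 # so its survival probability is the product of the members' chances.
 for n in (1, 2, 4, 8):
     p_array_survives = (1 - p_disk_fails) ** n
     print(n, "disks:", round(p_array_survives, 3))

 # 1 disks: 0.97
 # 2 disks: 0.941
 # 4 disks: 0.885
 # 8 disks: 0.784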

Mirroring and Hot Standby

RAID 1 is pure mirroring of one logical device to two or more physical devices. When sufficiently high bandwidth is available, typically with optical fibers, possibly line-of-sight lasers or leased optical capacity, RAID 1 mirroring can span physical sites. Geographically dispersed RAID 1 is not equivalent to a distributed database, as the latter implies more complex organization than does RAID alone. RAID, of course, is not incompatible with complex databases; it is merely that they will run on top of the essentially physical level of RAID.

File:RAID-mirroring.png
RAID 1: mirroring only

RAID 0+1 combines, or layers in RAID-speak, striping and mirroring, in that order. This method first creates two sets of RAID 0 arrays, and then mirrors from one RAID 0 to the other. The combined arrays can recover from any single volume failure, because the other array will have a clean copy.

File:RAID-0+1.png
RAID 0+1: mirroring with striping
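
The layering just described can be sketched structurally in Python; the classes and in-memory "disks" below are illustrative stand-ins, not any real driver's interfaces.

 class Stripe:                        # RAID 0 over member "disks" (dicts)
     def __init__(self, disks):
         self.disks = disks
     def write(self, block, data):
         self.disks[block % len(self.disks)][block] = data
     def read(self, block):
         return self.disks[block % len(self.disks)][block]

 class Mirror:                        # RAID 1 over two subordinate arrays
     def __init__(self, sides):
         self.sides = sides
     def write(self, block, data):
         for side in self.sides:      # every write goes to both stripe sets
             side.write(block, data)
     def read(self, block):
         for side in self.sides:      # any surviving side can answer
             try:
                 return side.read(block)
             except KeyError:
                 continue
         raise IOError("both sides of the mirror failed")

 # RAID 0+1 with four disks: stripe first, then mirror the two stripe sets.
 array = Mirror([Stripe([{}, {}]), Stripe([{}, {}])])
 array.write(7, b"payload")
 assert array.read(7) == b"payload"

Because mirroring is the outer layer, losing any single disk leaves the other stripe set intact as a clean copy.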

After a single device failure, replace the failed drive and restore the mirror as soon as possible. Any RAID level containing mirroring can have a hot spare, onto which the mirror copy lost in a failure can be restored.

Error Correction

RAID 5 computes an error-correcting code (ECC) over a set of devices, a code that allows the reconstruction of any single volume failure. This RAID level incorporates striping, during which the ECC is calculated quite efficiently. RAID 5 is the most common technique, offering nearly the reliability of mirroring. The ECC data is distributed across the devices, as the figure below shows.

File:RAID-5.png
RAID 5 distributed error correction
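
The distribution of parity can be illustrated with the arithmetic of one common rotation; real controllers vary in layout, and the four-disk width below is an assumption.

 NUM_DISKS = 4    # assumed width of the array

 def parity_disk(stripe):
     """Disk holding the parity block of a given stripe (one common rotation)."""
     return (NUM_DISKS - 1) - (stripe % NUM_DISKS)

 for stripe in range(4):
     holder = parity_disk(stripe)
     print("stripe", stripe, ":",
           " ".join("P" if d == holder else "D" for d in range(NUM_DISKS)))

 # stripe 0 : D D D P
 # stripe 1 : D D P D
 # stripe 2 : D P D D
 # stripe 3 : P D D D

Rotating the parity position prevents any single member from becoming a write hot spot.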

RAID 6 calculates the ECC over the set of volumes in two independent ways, allowing recovery from failures affecting any two volumes. While RAID 5 requires one extra disk for error-correcting storage, RAID 6 would appear to need two extra drives.

File:RAID-6.png
RAID 6 distributed error correction with redundant error-correcting data

In RAID 6, however, the hot spare can act as the second extra drive. Especially at an attended data center, or with intelligent scripts, this need not be a problem except at the highest level of fault tolerance. The only constraint on RAID 6 incorporating the hot standby is that restoring data should start only after the failed drive is replaced. RAID 6 protects against two failures; after a single failure, the system essentially drops back to what is effectively RAID 5, which can still correct a single error.
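
A simplified sketch of how two independent codes can coexist: P is plain XOR parity, while Q weights each data byte by a distinct power of a generator in the finite field GF(2^8), giving a second, independent equation. Real RAID 6 implementations differ in field and layout; this Python is illustrative only.

 def gf_mul(a, b):
     """Multiply in GF(2^8) modulo the polynomial x^8+x^4+x^3+x^2+1 (0x11d)."""
     result = 0
     while b:
         if b & 1:
             result ^= a
         a <<= 1
         if a & 0x100:
             a ^= 0x11d
         b >>= 1
     return result

 def gf_pow(a, n):
     result = 1
     for _ in range(n):
         result = gf_mul(result, a)
     return result

 def syndromes(data):
     """Compute (P, Q) over one byte taken from each data disk."""
     p = q = 0
     for i, d in enumerate(data):
         p ^= d                        # ordinary XOR parity
         q ^= gf_mul(d, gf_pow(2, i))  # weighted by generator powers
     return p, q

 data = [0x11, 0x22, 0x33]             # one byte per data disk (illustrative)
 p, q = syndromes(data)

 # A single lost data disk is repairable from P alone, exactly as in RAID 5:
 lost = 1
 rebuilt = p
 for i, d in enumerate(data):
     if i != lost:
         rebuilt ^= d
 assert rebuilt == data[lost]

With both P and Q intact, the two equations can be solved simultaneously to rebuild any two lost members.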

With RAID 5 alone, a single failure will leave the system operating, but vulnerable to one more error. With RAID 6 and good disk management software, failures can often be predicted, so that a degrading disk can be replaced at a scheduled time.

RAID is not a Panacea

The most fault-tolerant forms of RAID only protect against physical failures. Increasingly reliable disks, and less than ideal software, often mean that metadata corruption is more likely than hardware failure. Metadata that can become corrupted includes operating-system-specific volume mapping tables and, in Microsoft Windows systems, Active Directory.

Unless mirroring is at another site, RAID will not protect against destruction of physical disks. Mirroring must involve identical disks and controllers. For stable backup and restoral, treat the RAID system as a unit of controllers and disks.

Software errors, especially common with Windows, can corrupt physical volume mapping tables; these are particularly at risk during disk defragmentation, so it is always wise to have a backup, even with mirroring, before defragmenting. This may call for a separate transaction-logging server that can generate synthetic backups. It certainly calls for frequent file consistency checks, which may be able to repair corruption in its early stages.

Another approach is to mirror across at least three sets of volumes. Periodically, one set can be taken offline from application processing, so that a backup can be made of a known stable system. How can it be known whether a set is corrupted? Run metadata diagnostics before switching those physical disks back into service.

The three-set approach reflects the “once up, always up” philosophy of telephone carriers. It also allows you to have one set as the primary, one set as the secondary, and one available for maintenance.
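
The rotation cycle can be sketched as follows; the set names and the diagnostic and backup steps are placeholders for whatever tools a site actually uses.

 from collections import deque

 sets = deque(["set-A", "set-B", "set-C"])   # primary, secondary, offline

 def rotate_for_backup():
     primary, secondary, offline = sets
     print("serving from", primary, "mirrored to", secondary)
     print("running metadata diagnostics on", offline)   # placeholder check
     print("backing up", offline, "while quiesced")      # placeholder backup
     sets.rotate(-1)    # the checked set rejoins; another one goes offline

 for _ in range(3):     # each set takes a turn offline
     rotate_for_backup()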

  1. K. Ouchi, “System for recovering data stored in failed memory unit”, U.S. Patent 4,092,732.
  2. D. Patterson, G. Gibson, R. Katz (1987), A Case for Redundant Arrays of Inexpensive Disks (RAID), University of California at Berkeley, Department of Electrical Engineering and Computer Science, Technical Report 87-391.