Stripe Width and Stripe Size

RAID arrays that use striping improve performance by splitting files into small pieces and distributing them across multiple hard disks. Most striping implementations give the creator of the array control over two critical parameters that define how the data is broken into chunks and sent to the various disks. Each of these factors has an important impact on the performance of a striped array.

The first key parameter is the stripe width of the array. Stripe width refers to the number of parallel stripes that can be written to or read from simultaneously. This is of course equal to the number of disks in the array, so a four-disk striped array has a stripe width of four. Read and write performance of a striped array increases as stripe width increases, all else being equal, because adding more drives increases the parallelism of the array, allowing more drives to be accessed simultaneously. You will generally get superior transfer performance from an array of eight 18 GB drives than from an array of four 36 GB drives of the same drive family, all else being equal. Of course, eight 18 GB drives cost more than four 36 GB drives, and there are other concerns, such as power supply, to be dealt with.

The second important parameter is the stripe size of the array, sometimes also referred to by terms such as block size, chunk size, stripe length or granularity. This term refers to the size of the stripes written to each disk. RAID arrays that stripe in blocks typically allow the selection of block sizes ranging from 2 kiB to 512 kiB (or even higher) in powers of two (meaning 2 kiB, 4 kiB, 8 kiB and so on). Byte-level striping (as in RAID 3) uses a stripe size of one byte, or perhaps a small number such as 512, usually not selectable by the user. The short sketch below shows how these two parameters together determine where each piece of data lands.
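To make these two parameters concrete, here is a minimal Python sketch of how a simple striped array (RAID 0 style, with no parity) might map a logical byte address onto a member disk. The function name and the rotating layout are illustrative assumptions, not any particular controller's algorithm:

    # Minimal sketch of block-level striping (RAID 0 style, no parity).
    # stripe_width = number of disks; stripe_size = bytes per chunk.

    def locate_block(logical_byte, stripe_size, stripe_width):
        """Return (disk_index, byte_offset_on_disk) for a logical address."""
        chunk = logical_byte // stripe_size    # which chunk of the volume
        disk = chunk % stripe_width            # chunks rotate across the disks
        row = chunk // stripe_width            # complete rows of chunks so far
        offset = row * stripe_size + (logical_byte % stripe_size)
        return disk, offset

    # Example: a four-disk array (stripe width 4) with 64 kiB stripes.
    for addr in (0, 64 * 1024, 128 * 1024, 300 * 1024):
        print(addr, locate_block(addr, 64 * 1024, 4))

Notice that consecutive 64 kiB chunks land on consecutive disks, which is exactly why a wider array can keep more spindles busy on one large transfer.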
The impact of stripe size upon performance is more difficult to quantify than the effect of stripe width. Decreasing the stripe size breaks each file into smaller pieces spread across more drives: this improves transfer performance for a single file, because more drives work on it in parallel, but hurts positioning performance, because more drives must be engaged to satisfy each request. Increasing the stripe size does the opposite: more files fit entirely within a single stripe, so independent requests can be serviced by different drives simultaneously, at the cost of single-file transfer rate. The sketch below makes this tradeoff visible by counting how many disks a single request touches at various stripe sizes.
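Here is a second small Python sketch, again purely illustrative and assuming the rotating layout from the sketch above, that counts how many member disks one I/O request touches:

    # Count how many member disks one request touches, given its offset,
    # its length, and the array's stripe size and stripe width.

    def disks_touched(offset, length, stripe_size, stripe_width):
        first_chunk = offset // stripe_size
        last_chunk = (offset + length - 1) // stripe_size
        chunks = last_chunk - first_chunk + 1
        return min(chunks, stripe_width)  # can't touch more disks than exist

    # A large (256 kiB) and a small (4 kiB) read against a four-disk array:
    for stripe_kib in (4, 64, 256):
        big = disks_touched(0, 256 * 1024, stripe_kib * 1024, 4)
        small = disks_touched(0, 4 * 1024, stripe_kib * 1024, 4)
        print(f"{stripe_kib:3d} kiB stripes: 256 kiB read uses {big} disk(s), "
              f"4 kiB read uses {small} disk(s)")

Note how the large read engages all four disks until the stripe size grows past the request size, while the small read always fits on one disk: that is the transfer-versus-positioning tradeoff in miniature.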
Obviously, there is no "optimal stripe size" for everyone; it depends on your performance needs, the types of applications you run, and in fact, even the characteristics of your drives to some extent. (That's why controller manufacturers reserve it as a user-definable value!) There are many "rules of thumb" that are thrown around to tell people how they should choose stripe size, but unfortunately they are all, at best, oversimplified. For example, some say to match the stripe size to the cluster size of FAT file system logical volumes. The theory is that by doing this you can fit an entire cluster in one stripe. Nice theory, but there's no practical way to ensure that each stripe contains exactly one cluster. Even if you could, this optimization only makes sense if you value positioning performance over transfer performance; many people do striping specifically for transfer performance.
So what should you use for a stripe size? The best way to find out is to try different values: empirical evidence is the best guide for this particular problem (a rough timing sketch follows this paragraph). Also, as with most "performance-optimizing endeavors", don't overestimate the difference in performance between different stripe sizes; it can be significant, particularly when contrasting values from opposite ends of the spectrum like 4 kiB and 256 kiB, but the difference often isn't all that large between similar values. And if you must have a rule of thumb, I'd say this: transactional environments where you have large numbers of small reads and writes are probably better off with larger stripe sizes (but only to a point); applications where smaller numbers of larger files need to be read quickly will likely prefer smaller stripes. Obviously, if you need to balance these requirements, choose something in the middle. :^)
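If you want a starting point for such experiments, here is a rough Python timing sketch. The file path and sizes are placeholders, and a serious test must account for the operating system's cache (which this naive version does not), so treat the numbers as relative at best:

    # Crude timing harness: rebuild the array with a candidate stripe size,
    # put a large test file on it, then compare small and large random reads.
    # PATH and FILE_SIZE are placeholders; results are skewed by the OS cache.
    import os, random, time

    PATH = "/mnt/array/testfile"      # hypothetical file on the array
    FILE_SIZE = 512 * 1024 * 1024     # the test file must be at least this big

    def random_reads(path, io_size, count):
        fd = os.open(path, os.O_RDONLY)
        start = time.perf_counter()
        for _ in range(count):
            os.pread(fd, io_size, random.randrange(0, FILE_SIZE - io_size))
        os.close(fd)
        return time.perf_counter() - start

    print("1000 small (4 kiB) reads:", random_reads(PATH, 4 * 1024, 1000), "s")
    print("100 large (1 MiB) reads :", random_reads(PATH, 1 << 20, 100), "s")

Run the same pair of tests after rebuilding the array with each candidate stripe size and compare the timings against your actual workload's mix of small and large requests.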