"The more I find out, the less I know."

Friday - November 25, 2005 at 09:05 AM

Homemade RAID


Our Mac G4 Cube finally bit the big one this week, so on Wednesday I went out and bought a brand-new Mac Mini (and a cheap flat-panel monitor).
Let me begin by saying that the Mini is probably the most amazing computer I've ever owned. It is an astonishingly capable machine, and until you see one in person, you just don't understand how tiny it really is. Imagine a stack of about six CD jewel cases: that's the size of the entire computer.

As has been my practice for the last few dead-Mac transitions, the first thing I did was extract the hard drive from the old computer, put it in an external USB case, and transfer the contents to the new computer. Apple makes it so utterly simple to transfer your life from one Mac to another (literally two clicks) that it is truly a joy--especially if you've ever had to go through the pain and agony (as I have, several times) of doing it under Windows.

But I have a problem. Not a let's-call-my-geeky-friend kind of problem, but an I-should-enroll-in-a-12-step-program kind of problem. My problem is that every so often I see how utterly cheap hard drives are, and I can't resist buying a 250GB external drive for ten Cheerios boxtops plus $2.50 postage, or whatever the price happens to be that day. Never mind that I don't have a use for all that storage: it is just so cool that you can get so much for so little that I have to have it.

So combine that problem with my practice of putting the hard drives from old computers into external cases, and I have this accumulation of external hard drives sitting around basically doing nothing.

Then the obvious idea hit me: Why not build a RAID array from all these external drives? At least then I'd have something coherent (a giant virtual disk with a goodly fraction of a terabyte) instead of a whole bunch of external drives.

I'd always known that Mac OS X comes with built-in RAID capability, but I never had the excuse to explore it.

RAID stands for Redundant Array of Inexpensive Disks (some people use the word "Independent" instead; either is correct). It was originally designed by cash-strapped researchers in the late 1980s to take advantage of plunging consumer-grade disk drive prices: by binding together multiple small, unreliable drives, they could build bigger, more reliable virtual hard drives than they could otherwise afford.

In fact, a RAID array can provide a bigger, faster, and more reliable storage device than anything you could get from a single drive (no matter how much you wanted to spend), using cheap off-the-shelf drives. It is now the standard way to build large storage systems.

The RAID capability in OS X is relatively easy to set up (as such things go), but it is still a little geeky. It is hidden inside the "Disk Utility" program, and gives you three modes.

"Mirrored" lets you combine multiple drives into a single redundant drive with the capacity of the smallest drive of the set. Each drive contains the entire contents of the array, and as long as any one drive is functional, the virtual drive works properly. In other words, you can combine three 80-GB drives into a single 80-GB drive, and any two of the three can break before the virtual drive stops working. "Mirrored" mode is for creating highly reliable virtual disks.
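
For the curious, here is a toy Python sketch of the mirroring idea. It is only an illustration, not how Disk Utility actually does it: the class name and the dict-per-drive model are made up for this example.

```python
class MirroredArray:
    """Toy mirror: every member drive holds a full copy of every block."""

    def __init__(self, drive_sizes_gb):
        # Usable capacity is limited by the smallest member.
        self.capacity_gb = min(drive_sizes_gb)
        # Each drive is modeled as a dict of block number -> data (an assumption).
        self.drives = [{} for _ in drive_sizes_gb]
        self.failed = set()

    def write(self, block, data):
        # A write goes to every drive that is still working.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block] = data

    def fail(self, index):
        # Simulate a dead drive.
        self.failed.add(index)

    def read(self, block):
        # Any surviving drive can serve the read.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block]
        raise IOError("all mirror members have failed")


array = MirroredArray([80, 80, 80])
array.write(0, b"family photos")
array.fail(0)
array.fail(2)
print(array.capacity_gb)  # 80
print(array.read(0))      # b'family photos'
```

Two of the three toy drives are dead at the end, and the read still succeeds from the one that is left.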

"Striped" combines multiple drives into a single larger drive, spreading the contents of the array across all drives in the array. This gives you both a bigger drive and faster access (since data is read from multiple disks simultaneously). But there's no redundancy: if any one drive in the array fails, the whole thing goes down.

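A rough sketch of how striping spreads data around, assuming the simplest possible layout: blocks are dealt out to the drives round-robin, one block at a time. The function name is hypothetical, and a real implementation would use larger stripe units.

```python
def locate_striped(block, num_drives):
    """Map a logical block number to (drive index, block on that drive)."""
    # Consecutive blocks land on consecutive drives, wrapping around.
    return block % num_drives, block // num_drives


for block in range(6):
    print(block, locate_striped(block, 3))
# 0 (0, 0)
# 1 (1, 0)
# 2 (2, 0)
# 3 (0, 1)
# 4 (1, 1)
# 5 (2, 1)
```

Because neighboring blocks live on different drives, a long read can keep all the drives busy at once. It also shows why losing one drive is fatal: every third block goes with it.
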
"Concatenated" also combines multiple drives into a single larger drive, like the "Striped" mode, except that a given block of data is only written to a single drive. This has the advantage of letting you add new drives to the array after it is set up, so you can grow the array as needed, and you can combine drives of different sizes without wasting any space.

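Concatenation can be sketched the same way. This assumes the drives are simply laid end to end, with sizes counted in blocks; the function name and the drive sizes are made up for illustration.

```python
def locate_concatenated(block, drive_blocks):
    """Map a logical block number to (drive index, block on that drive)."""
    # Walk the drives in order, skipping past each one the block does not fit in.
    for index, size in enumerate(drive_blocks):
        if block < size:
            return index, block
        block -= size
    raise IndexError("block is beyond the end of the array")


drives = [80, 120]                       # hypothetical sizes, in blocks
print(locate_concatenated(79, drives))   # (0, 79)
print(locate_concatenated(80, drives))   # (1, 0)

drives.append(250)                       # grow the array with a third drive
print(locate_concatenated(80, drives))   # (1, 0), existing data stays put
print(locate_concatenated(200, drives))  # (2, 0)
print(sum(drives))                       # 450, no space wasted
```

Adding a drive only extends the address range at the end, which is why the array can grow without reshuffling what is already stored.
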
Right now, the built-in RAID doesn't include more sophisticated modes like N+1 redundancy (where, for example, five drives can be combined into an array with the capacity of four of the five drives, and any one of them can fail without taking the array down). But it does do a couple of other nifty tricks.
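
In case N+1 redundancy sounds like magic, the usual trick behind it (in RAID levels 4 and 5, for instance) is XOR parity. Here is a minimal sketch with four data drives and one parity drive; the block contents and the helper name are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data drives, each holding one block, plus one parity drive.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Pretend drive 2 died: rebuild its block from the survivors plus parity.
lost = 2
survivors = [blk for i, blk in enumerate(data) if i != lost] + [parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'CCCC'
```

Five drives, the capacity of four, and any single one can be reconstructed from the rest.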

One is that you can set up the RAID using individual partitions on a given drive. For example, if you have an 80-GB drive and a 120-GB drive, and you want to set up a mirrored array (which has the capacity of the smallest drive in the array), normally you would lose 40 GB of capacity from the 120-GB drive. But under OS X, you can set up an 80-GB partition on the larger drive for the RAID, and leave yourself with an extra 40-GB partition to use for something else.
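
The arithmetic is simple enough to sketch in a few lines of Python. The function name is hypothetical; it just works out the mirror partition size and what is left over on each drive.

```python
def plan_mirror_partitions(drive_sizes_gb):
    """Return (mirror size, leftover space per drive) for a mirrored set."""
    # The mirror can be no bigger than the smallest drive.
    mirror_gb = min(drive_sizes_gb)
    # Whatever remains on each drive can become a separate partition.
    leftover_gb = [size - mirror_gb for size in drive_sizes_gb]
    return mirror_gb, leftover_gb


mirror_gb, leftover_gb = plan_mirror_partitions([80, 120])
print(mirror_gb, leftover_gb)  # 80 [0, 40]
```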

The other neat trick is that you can create hybrid arrays, where you combine arrays of arrays. Suppose, for example, that you had a bunch of drives sitting around of random sizes, and you wanted to create a single big array. Normally you would use the "Concatenated" type, but that has the disadvantage of being unreliable: if any drive in the array goes down, the entire array is toast. If you want reliability, too (sacrificing half the capacity), you can create two "Concatenated" arrays which are approximately the same size, then combine those two arrays into a single "Mirrored" array. Then if any one drive fails, the virtual array continues to function normally.
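
One way to decide which drives go in which half is a simple greedy split: take the drives largest first and always add the next one to whichever half is currently smaller. This is just a sketch of one reasonable approach, not something Disk Utility does for you, and the drive sizes below are made up.

```python
def split_for_mirror(drive_sizes_gb):
    """Greedily split drives into two concatenated sets of similar size."""
    halves = ([], [])
    totals = [0, 0]
    for size in sorted(drive_sizes_gb, reverse=True):
        # Put the next-largest drive in whichever half is currently smaller.
        target = 0 if totals[0] <= totals[1] else 1
        halves[target].append(size)
        totals[target] += size
    # The mirror of the two halves is only as big as the smaller half.
    return halves, min(totals)


halves, usable_gb = split_for_mirror([250, 120, 80, 80, 60, 40])
print(halves)     # ([250, 60], [120, 80, 80, 40])
print(usable_gb)  # 310
```

Here 630 GB of assorted drives turns into a 310-GB virtual disk that survives the loss of any single drive.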

This is not as efficient (in terms of storage space) as N+1 redundancy, but it wins hands-down on flexibility, especially if your basic problem is not building a highly redundant data center, but simply making good use of a bunch of random hard drives sitting around.

So I'm setting up two RAID arrays on our Mac Mini. One is a Mirrored array combining the Mini's internal drive and an 80-GB partition on an external drive. This is the boot disk, and has all our precious data like Quicken files, family photos, and the like. If we're careful, we should never lose important stuff.

The other array will be a Concatenated array which will probably have about 600GB capacity. Since this is unreliable storage, it is essentially a giant scratch disk for things we could afford to lose: backup images of CDs, working files for editing home video, and so forth.

So we'll have a small(-ish) virtual disk which will be highly reliable, and a huge virtual disk which will be unreliable.

Now I just need to go find some data.

