We accumulate lots of digital assets these days. Photos, music and movies, particularly, require massive storage. Storage is cheap, so keeping up with our storage needs is not unaffordable. The difficult bit, however, is scaling the storage so we don’t go through the bothersome cycle of filling up our current storage, upgrading and migrating to a bigger one, only to fill it up again a year or two later.
You see, what happens with most people is that you start off with one disk. Let’s say it is a 250 GB disk. You outgrow it. So you buy another one, probably 500 GB. So now you have two disks, and whatever won’t fit in the 250 GB spills over into the new 500 GB one.
At some point, you outgrow both disks. You buy a 1 TB one. Now, instead of having a bunch of fragmented disks, and having to decide what stuff goes where, you decide to get a “much bigger” disk so that you can consolidate everything together. The fragmentation is troublesome. Imagine you have “Disk A” and “Disk B”, and by your asset distribution algorithm, you decide some piece of new data ought to belong in “Disk A”, but only “Disk B” has the free space.
The problem with leaping to a “much bigger” disk is that it tends to cost a premium. Like everything, the best costs too much. The next best is often good enough, and often good value for money.
I’ve been thinking about building a DIY scalable NAS. When one talks about NAS, RAID often also comes to mind. RAID (at the appropriate RAID levels) can give you the reliability or aggregation, or both, of multiple disks. The trouble with RAID, typically, is that you cannot easily grow it. Once the RAID set is defined, you are more or less locked in.
What “better” scalable solutions are there?
For a long time, Sun Microsystems (now part of Oracle) has had something known as ZFS. It is, in a nutshell, a combined volume manager and filesystem. One of its most important features is that you can grow a ZFS filesystem. Yes, you can add disks later. Or you can replace existing disks with larger ones. It’s really nice.
Another option that is more “license-friendly” (in my humble opinion) is Btrfs. When I first looked at Btrfs a few years ago, there were a few obvious shortcomings. One of them was its relatively limited use in production environments, compared with ZFS. It was still clearly in active development, and there were features missing. In my tests at that time, Btrfs didn’t leave me with much confidence in its robustness either.
A year or two has since passed. I’m looking at Btrfs again with renewed interest. Btrfs now supports RAID 0, RAID 1, RAID 10, RAID 5 and RAID 6. It supports cloning, subvolumes, snapshots, send/receive of diffs, and quota groups. In-band dedup and online fsck are planned. It’s really starting to get interesting. ZFS is still a good option, but ZFS is somewhat more limited when it comes to dynamically changing its disk pool.
The flexibility of Btrfs’ disk pool lets you do these things:
- Start with one or two disks, and add more as you need.
- Disks don’t even have to be the same size.
- You can have a “wrong” number of disks, like using 3 disks (and maximising their space) in a RAID 1 setup.
- You can also remove some disks dynamically if you decide you need to claim some back for other uses.
- You can dynamically change RAID levels, such as starting with RAID 1 and then deciding to go with RAID 10 later.
The ability to dynamically change the Btrfs disk pool is really fascinating and, particularly for a home user, gives you the most flexibility to buy and use disks as needed. I can’t stress enough how useful it is that Btrfs lets you mix disks of different sizes. This is possible because RAID in Btrfs is not quite the same as conventional RAID. Instead, it works on chunks, and distributes/allocates those chunks across devices.
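To get a feel for why chunk-based allocation copes with mixed disk sizes, here is a rough Python sketch. It is a simplified model, not the actual Btrfs allocator: it assumes every RAID 1 chunk is the same size, that each chunk is mirrored onto the two devices with the most free space at that moment, and it ignores metadata and other overhead. The names `raid1_usable` and `chunk_gb` are made up for illustration.

```python
def raid1_usable(disk_sizes_gb, chunk_gb=1):
    """Estimate usable space of a chunk-based RAID 1 pool (simplified model).

    Each chunk is stored twice, on two different devices. This sketch
    repeatedly places a chunk on the two devices that currently have the
    most free space, and stops when fewer than two devices can fit one.
    """
    free = list(disk_sizes_gb)
    usable = 0
    while len(free) >= 2:
        # Indices of devices, ordered from most to least free space.
        order = sorted(range(len(free)), key=lambda i: free[i], reverse=True)
        first, second = order[0], order[1]
        if free[second] < chunk_gb:
            break  # no second device left to hold the mirror copy
        free[first] -= chunk_gb
        free[second] -= chunk_gb
        usable += chunk_gb
    return usable


if __name__ == "__main__":
    # Hypothetical pool: 2x 3 TB, 2x 2 TB and 1x 1 TB, in decimal GB.
    print(raid1_usable([3000, 3000, 2000, 2000, 1000]))
```

Under this model, a pool of 2x 3 TB, 2x 2 TB and 1x 1 TB drives works out to about 5.5 TB of mirrored space, which is half of the raw total. A mismatched pair such as 250 GB and 500 GB only gives 250 GB, because the mirror copy has nowhere else to go once the smaller disk is full. Real numbers will differ somewhat because of chunk sizes and metadata.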
All things considered, Btrfs looks like a good choice for a DIY scalable NAS.
For those interested, my current setup is a Linux PC with a biggish casing that can take quite a number of drives (7 internal bays, 2 5.25″ full-height bays, and one 3.5″ hot swap bay). I use a 256 GB SSD as a boot drive with Ubuntu 13.10 installed on it. The drive bays are populated with 2x 3 TB drives, 2x 2 TB drives, and one older 1 TB drive. There is space to grow, but unfortunately there aren’t enough SATA ports onboard. I will have to look for a SATA PCIe card.
I’ll save the Btrfs setup for another post, another day.
What about other options such as unRAID (http://lime-technology.com/)?