Earlier this year, I shared my plans to build a storage server. I had been quite excited about Btrfs. Btrfs has very nice features, particularly attractive to home users who need the flexibility to grow, shrink, or reassign their disk drives. I’ve been trying out Btrfs on my data, and it’s time to share some findings. Unfortunately, it’s bad news.
Sadly, Btrfs has consistently corrupted my data. I’ve given up on it. That’s my short verdict. If there’s anything to take away from this experience, this is it.
Here are the details. To test Btrfs, I subjected my storage to some failure scenarios, and I kept losing data. I’m the kind of person who, if I want to test something, will stress test it thoroughly: purposely force-disconnecting a disk, writing random data directly to the disk, and so on. Btrfs failed.
In fact, I was sorely disappointed that even after reducing the test scenario to something extremely basic, something one would ordinarily expect to work perfectly, Btrfs continued to disappoint. This is what I did (a rough command-level sketch follows the list):
- Create a Btrfs volume out of 3 physical disks, with RAID1 profile for both metadata and data storage.
- Run a scrub. Passed without errors.
- Copy 1.2TB of my real data onto the volume.
- Run a scrub. Passed without errors.
- Run a balance.
- Run a scrub again. Passed without errors.
- Now, remove a disk by using the prescribed Btrfs command. E.g.,
  $ btrfs device delete /dev/sdb /nas
- Run a scrub. Failed with too many unrecoverable checksum errors.
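In command terms, the sequence above is roughly the following; a minimal sketch, assuming the three disks are /dev/sdb, /dev/sdc and /dev/sdd and the volume is mounted at /nas (only /dev/sdb and /nas appear in the steps above; the other names are illustrative):

# Create a Btrfs volume across three disks, RAID1 for both data and metadata
$ mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc /dev/sdd
$ mount /dev/sdb /nas

# Scrub in the foreground (-B) so the command reports the result when it finishes
$ btrfs scrub start -B /nas

# ... copy the ~1.2TB of data, scrub again, then rebalance and scrub once more
$ btrfs balance start /nas
$ btrfs scrub start -B /nas

# Remove one device from the volume, then run the final scrub
$ btrfs device delete /dev/sdb /nas
$ btrfs scrub start -B /nas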
I repeated that. Three times. Each time, I ended up with extensive checksum errors. My disks are good; I had tested them beforehand.
I did do a simple test with three 8GB file-based virtual devices forming a Btrfs volume. With far less data, I could trash Btrfs any way I liked (within its design parameters, of course), and the Btrfs volume came away fine.
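If you want to reproduce that small-scale test, the file-backed devices can be set up along these lines; a minimal sketch, where the file names, loop device numbers and mount point are just illustrative:

# Create three 8GB sparse files and attach them as loop devices
$ truncate -s 8G disk0.img disk1.img disk2.img
$ sudo losetup -f --show disk0.img   # prints the allocated device, e.g. /dev/loop0
$ sudo losetup -f --show disk1.img
$ sudo losetup -f --show disk2.img

# Build the same RAID1 (data + metadata) volume on the loop devices
$ sudo mkfs.btrfs -d raid1 -m raid1 /dev/loop0 /dev/loop1 /dev/loop2
$ sudo mount /dev/loop0 /mnt/test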
But it just wouldn’t work with my real data set of some 1.2TB. It is really disappointing. I cannot understand why others have not had any trouble with Btrfs. I would really have loved Btrfs to work out.
I’m now moving on to consider another option: GlusterFS. GlusterFS is designed as a scale-out network-attached storage file system. A “problem” with Btrfs is that it doesn’t provide resiliency across servers and/or sites. This means a *touch wood* catastrophic incident such as a fire would completely take out the Btrfs volume. You’d still need a copy of the data somewhere else, on another server, preferably in another location.
For GlusterFS to be effective in providing redundancy, one would need multiple servers. I’m thinking of my Raspberry Pis being those multiple servers. They are slow, but the network connecting the far-away Raspberry Pis is likely to be slow anyway. It is, however, a little less straightforward (not complicated, just not entirely trivial) to pool together multiple disks on the same server without undermining GlusterFS’s assurance of filesystem resiliency. I’ll update when I’ve had the chance to properly explore GlusterFS.
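As a rough idea of what that could look like, a three-way replicated GlusterFS volume across three Raspberry Pis would be created along these lines; a minimal sketch, with the hostnames (pi1, pi2, pi3), brick paths and volume name all being placeholders:

# Create a brick directory on every node, then from one node probe the other peers
$ sudo mkdir -p /data/brick1/gv0
$ sudo gluster peer probe pi2
$ sudo gluster peer probe pi3

# Create and start a volume that keeps one replica on each of the three nodes
$ sudo gluster volume create gv0 replica 3 \
      pi1:/data/brick1/gv0 pi2:/data/brick1/gv0 pi3:/data/brick1/gv0
$ sudo gluster volume start gv0

# Mount the volume from any client
$ sudo mount -t glusterfs pi1:/gv0 /mnt/gv0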
Wouldn’t a RAID1 require 2 or 4 disks? Or how did you do your RAID config?
That’s with typical RAID1, which works at the disk level. In Btrfs (and similarly in others like ZFS), the redundancy is provided within the filesystem, not at the disk level. So in a RAID1 configuration, Btrfs will ensure that each chunk of data is always written to two different hard disks. In a three-disk system, that means sometimes a data chunk is written to disks A and B, another time it could be B and C, and yet other times it could be A and C.
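If you want to see that for yourself, Btrfs can report how the chunks ended up spread across the individual devices; assuming the volume is mounted at /nas as above:

# Per-device breakdown of allocated data and metadata chunks
$ btrfs device usage /nas

# Overall usage figures, including the RAID1 profile per chunk type
$ btrfs filesystem usage /nas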