If, like me, you worry about data degradation on your document/archive NAS, it might be worthwhile looking into SnapRAID. SnapRAID is a backup solution for disk arrays. It stores parity information on an extra disk and keeps hashes of your data to ensure integrity, which gives you an additional recovery option when a disk in the array fails. As an extra benefit, you can recover specific files after accidental deletion and verify the current data set. I use it mostly for that last option, because it gives an extra layer of safety for those important files I don’t want to see get corrupted, without having to keep a one-to-one backup.
Let’s have a look at the sales talk from the SnapRAID site:
- All your data is hashed to ensure data integrity and to avoid silent corruption.
- If the failed disks are too many to allow a recovery, you lose the data only on the failed disks. All the data in the other disks is safe.
- If you accidentally delete some files in a disk, you can recover them.
- You can start with already filled disks.
- The disks can have different sizes.
- You can add disks at any time.
- It doesn’t lock-in your data. You can stop using SnapRAID at any time without the need to reformat or move data.
- To access a file, a single disk needs to spin, saving power and producing less noise.
All in all, not a bad feature list. SnapRAID can be found on GitHub, and compiled packages exist for Fedora and CentOS (via EPEL).
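To make the recovery claims in the feature list a bit more concrete, here is a toy sketch of single-parity recovery using XOR. It is only an illustration of the general idea, not SnapRAID's actual on-disk format or code: the function names and the `lost_length` parameter are made up for this example, and each "disk" is just a byte string. Shorter blocks are treated as zero-padded, which is one way disks of different sizes can share a single parity.

```python
def xor_parity(blocks: list[bytes]) -> bytes:
    """XOR all blocks together, treating shorter blocks as zero-padded."""
    size = max((len(block) for block in blocks), default=0)
    parity = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover(surviving: list[bytes], parity: bytes, lost_length: int) -> bytes:
    """Rebuild the single missing block from the survivors and the parity.

    lost_length is a hypothetical parameter for this sketch: the caller
    has to know how long the lost block was, so the zero padding can be
    cut off again.
    """
    rebuilt = xor_parity(surviving + [parity])
    return rebuilt[:lost_length]


# Three "disks" of different sizes and one parity block.
disks = [b"documents", b"photos and more", b"music"]
parity = xor_parity(disks)

# Pretend the second disk died and rebuild it from the rest.
restored = recover([disks[0], disks[2]], parity, len(disks[1]))
```

With one parity block, only one lost disk can be rebuilt this way. If two disks fail, the data on those two is gone, but the surviving disks are still readable on their own, which matches the second point in the list.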
One downside I have found is that it can take ages to collect all the hashes, and the hashes aren’t striped. SnapRAID will also not help if the machine goes up in flames, which makes it a good RAID backup solution but not a full data backup solution.
SnapRAID does allow you to verify all the data present in the array with the scrub command. When a scrub is started, it compares the hashes it stored with the data on the disks and writes an error to a file if a difference is found. You can then restore the file from the parity data or update the hash.
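The idea behind a scrub can be sketched in a few lines of Python. This is a simplified illustration, not how SnapRAID is implemented: it assumes SHA-256 as the hash (SnapRAID uses its own hash functions), hashes whole files rather than blocks, and the names `hash_file`, `build_manifest` and `scrub` are invented for this example.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Store a hash for every regular file under root."""
    return {
        str(path.relative_to(root)): hash_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def scrub(root: Path, manifest: dict[str, str]) -> list[str]:
    """Re-hash each file in the manifest; return paths that are missing or changed."""
    damaged = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or hash_file(target) != expected:
            damaged.append(rel_path)
    return damaged
```

You would call `build_manifest` once while the data is known to be good, keep the result somewhere safe, and later run `scrub` against the same directory. Anything it returns has either silently changed or disappeared since the hashes were taken.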
For more info you should check out the manual here.