With the release of ZFS on Solaris 10, I sat down and marveled at the opportunities for off-site backups. I have already written a bit about ZFS detailing why I think it kicks so much ass. With zfs send and zfs receive, one can manage block-level incremental backups and restores. What's missing? An elegant hack leveraging that to provide a simple and reliable backup infrastructure for a network of ZFS-capable machines (including Mac OS X and FreeBSD now, BTW).
So, I sat down and wrote Zetaback, which is currently 1032 lines of Perl code (including complete documentation) plus a thin agent on remote machines that is 290 lines of Perl code (including complete documentation). I'd like to note that the only reason there is documentation, let alone complete documentation, is Eric Sproul. This really demonstrates to me that "Keep It Simple, Stupid" still works for important tasks.
Zetaback is a rather full-featured backup and restore system. It can manage multiple hosts, multiple ZFS filesystems per host, and both frequency and retention policies on full and incremental backups. It can report policy violators (things that haven't been backed up within the policy). It can manage the archiving of backups. It provides both non-interactive and interactive restores. It has an excellent command line syntax. And most importantly, it has saved my ass more times than I can count.
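To make "policy violators" concrete, here is a minimal sketch of the idea in Python. This is not Zetaback's code (Zetaback is Perl); the function name, the data layout, and the single maximum-age threshold are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def find_policy_violators(last_backup, max_age, now):
    """Return the (host, filesystem) pairs that violate the backup policy.

    last_backup: dict mapping (host, filesystem) to the datetime of the most
                 recent backup, or None if there has never been one.
    max_age:     timedelta, the longest allowed gap since the last backup.
    now:         datetime used as the reference point.

    All names here are hypothetical, not taken from Zetaback.
    """
    violators = []
    for key in sorted(last_backup):
        when = last_backup[key]
        # Never backed up, or backed up too long ago: both are violations.
        if when is None or now - when > max_age:
            violators.append(key)
    return violators
```

Called with a table of last-backup times and a one-day policy, it returns every filesystem that has gone more than a day without a backup, plus any that have never been backed up at all.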
I'm not usually big on awards... I find the single unexpected email from someone saying: "damn that was useful, thanks!" to be more gratifying most of the time. However, Zetaback was one of the first projects we put up on labs, so being a 3rd place winner in the OpenSolaris Community Innovation Awards is pretty exciting.
Friday, September 19. 2008 at 10:08
When I click on the links in your article (when reading it via RSS in Google Reader) I get taken to some exit.php page. If I open the article in a new tab and click the links they work.
Thought you'd like to know.
Friday, September 19. 2008 at 13:25
Theo,
Congratulations on "Zetaback" winning 3rd prize in the OpenSolaris Community Innovation Awards.
Jignesh
Saturday, September 20. 2008 at 18:38
Congratulations!
Does your backup solution store the backed-up snapshots in directories? Or is the stream received and imported on the master node?
I think your solution seems really good, but a send is more reliable when paired with a receive on the other end.
Saturday, September 20. 2008 at 22:37
Marcelo:
The sends are stored (optionally compressed) as plain files in a directory for each remote host. They are not received on the remote end. This is actually more flexible. It means that the host running zetaback could be any OS (not just one supporting ZFS). The advantage of receiving them is making them accessible on the backup host (FS exploration). Right now we do incrementals against a fixed base; if we received them, we'd have to start doing them only against the last snapshot.
Suffice it to say that storing them as a file instead of as a ZFS filesystem was a thoroughly deliberated decision.
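For the curious, a rough sketch of what "incrementals of a fixed base" means in terms of zfs send. It is in Python for illustration only and is not how Zetaback itself is written; the function names and snapshot names are hypothetical. The only things taken as given are the two standard forms of the command: zfs send of a snapshot for a full stream, and zfs send -i with a base snapshot for an incremental one.

```python
def send_command(filesystem, snapshot, base=None):
    """Build the argv for a full or incremental zfs send.

    With base=None this is a full send of filesystem@snapshot. Otherwise it
    is an incremental send of the changes between filesystem@base and
    filesystem@snapshot. (Hypothetical helper, not part of Zetaback.)
    """
    target = f"{filesystem}@{snapshot}"
    if base is None:
        return ["zfs", "send", target]
    return ["zfs", "send", "-i", f"{filesystem}@{base}", target]

def restore_chain(full, incrementals, wanted):
    """List the stored streams needed to restore the snapshot 'wanted'.

    Assumes every incremental was taken against the same full backup, so a
    restore needs the full stream plus at most one incremental stream.
    """
    if wanted == full:
        return [full]
    if wanted not in incrementals:
        raise KeyError(wanted)
    return [full, wanted]
```

Under that assumption a restore never replays more than two files. If the incrementals were chained off the last snapshot instead, restoring the newest one would mean replaying every stream since the full.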
Wednesday, September 24. 2008 at 20:56
If you are backing them up as files, that's quite clever. Probably without much work it would be simple to also store those files on Amazon S3.