Automated and painless backup with rdiff-backup

This article will show you how to set up and configure rdiff-backup to back up a set of specified directories on computer A in an automated, painless and secure way to computer B. The backup process will be initiated from computer B, which means that computer B doesn't have to be online 24/7. It also means that it is relatively easy to make one computer a hub, backing up several servers.

In my setup I use my workstation to backup my server. So, for the sake of convenience, from here on the computer being backed up (computer A) will be called server and the computer where the backup is saved (computer B) will be called workstation (shortened ws). Both my server and workstation are running Debian GNU/Linux. The server is running the stable release (sarge) and the workstation unstable (sid). This means that if you’re using something else, you’ll have to figure out the installation part on your own. However, the rest should work equally well on any distribution.


Installing rdiff-backup

First we have to install rdiff-backup. On workstation, this is simple.
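On Debian it's a single command, run as root:

    apt-get install rdiff-backup

Since workstation is running sid, this pulls in a recent version straight from the archive.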

Since we want the same version of rdiff-backup on both computers, we will install rdiff-backup from backports.org on server. We add the following lines to /etc/apt/sources.list.
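Something along these lines (the exact mirror URL and components may differ, so check the current instructions on backports.org):

    deb http://www.backports.org/debian/ sarge-backports main
    deb-src http://www.backports.org/debian/ sarge-backports main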

To make sure only rdiff-backup is taken from backports.org, we pin all packages to a priority below the standard priority and then explicitly pin rdiff-backup to a higher one. We do this by adding the following lines to /etc/apt/preferences.
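The priorities below are just reasonable example values; anything under 500 for the general pin and anything above it for rdiff-backup will do:

    Package: *
    Pin: release a=sarge-backports
    Pin-Priority: 200

    Package: rdiff-backup
    Pin: release a=sarge-backports
    Pin-Priority: 999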

Then it's just the standard commands.
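That is, as root on server:

    apt-get update
    apt-get install rdiff-backup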

Configure ssh to work with rdiff-backup

Much of this section has been taken from Dean Gaudet's excellent howto: Unattended rdiff-backup. All credit should go to him. I've just made small changes.

To do the backup we’ll use a local, non-root user on workstation. I’ve decided to use the pre-existing user backup. It has the following entries in /etc/passwd, /etc/shadow and /etc/group.
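On a stock Debian install those entries look roughly like this (the date field in the shadow line will of course differ on your system):

    /etc/passwd:  backup:x:34:34:backup:/var/backups:/bin/sh
    /etc/shadow:  backup:*:12345:0:99999:7:::
    /etc/group:   backup:x:34: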

If you don’t have that user, feel free to use another one. Or create a backup user with this command.
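Something like the following should work; the exact adduser flags are a matter of taste, I'm just making sure it's a system user with /var/backups as its home and a usable shell:

    adduser --system --group --home /var/backups --shell /bin/sh backup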

In my case, the user backup didn't have write permission to /var/backups. I therefore created, as root, the two directories that the user will need: the .ssh directory to store ssh config files, and rdiff-backups to store, well, the rdiff backups.
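As root on workstation:

    mkdir /var/backups/.ssh /var/backups/rdiff-backups
    chown backup:backup /var/backups/.ssh /var/backups/rdiff-backups
    chmod 700 /var/backups/.ssh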

Next we’ll create a ssh key so that backup@workstation can log in as root@server unattended (without entering a password) to perform the backup. If you think this seems a bit insecure, don’t worry. We’ll make it secure later on.
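As the backup user on workstation (the empty -N "" gives a key without a passphrase, which is the whole point of an unattended setup; the file name is the one we'll refer to later on):

    su - backup
    ssh-keygen -t rsa -N "" -f /var/backups/.ssh/id_rsa_backup_server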

To be able to use the newly generated ssh key together with rdiff-backup, we’ll need to create a ssh alias. We will call the alias backup-server and this is done by adding the following lines to /var/backups/.ssh/config (make sure you edit the file as the backup user).
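Replace server.example.org with the real hostname or IP address of your server:

    Host backup-server
        HostName server.example.org
        User root
        IdentityFile /var/backups/.ssh/id_rsa_backup_server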

If you wish, you can add other options like Compression yes, Cipher blowfish etc as well (man ssh_config for a list of options). Personally I've added Port 222, since the ssh service on my server is running on port 222. This way I avoid practically all automated brute-force login attempts. It works a lot better than other, more elaborate attempts to block unwanted connections.

The next thing on our agenda is making sure root can log in on server. The file /etc/ssh/sshd_config on server should contain the following lines (among others).
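At the very least, something like this (PermitRootLogin will be tightened further down):

    PermitRootLogin yes
    PubkeyAuthentication yes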

If not, add or change existing lines and reload the server configuration.
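On Debian that means:

    /etc/init.d/ssh reload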

Then, back to workstation. Now we need to copy the public ssh key to server. We do this by using the ssh-copy-id command.
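As the backup user on workstation (again, replace server with your real hostname):

    ssh-copy-id -i /var/backups/.ssh/id_rsa_backup_server.pub root@server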

Or, if you are like me and are running the ssh service on port 222, use this command.
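Older versions of ssh-copy-id have no port option, but they pass the quoted string straight through to ssh, so this trick works (newer versions also accept a separate -p 222 flag):

    ssh-copy-id -i /var/backups/.ssh/id_rsa_backup_server.pub "-p 222 root@server"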

Now backup@workstation should be able to log in as root@server without entering a password. Since this is the first time, you might have to verify and accept the host key.
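Test it as the backup user, using the ssh alias we just created:

    ssh backup-server

You should get a root shell on server without being asked for a password (don't worry, we'll restrict this heavily in a moment).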

If it didn't work, go back and check that you've done everything as described. Please contact me if you are unable to figure out what's wrong. Maybe I can be of assistance.

Before we proceed, it's time to tighten the system a bit. Personally, I don't like allowing unlimited root logins, especially not with an ssh key without a password. So, while logged in as root on server, edit the file /root/.ssh/authorized_keys. It should contain a line with the exact same contents as the file /var/backups/.ssh/id_rsa_backup_server.pub on workstation and look similar to this (I've shortened the key, yours will be longer):
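    ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA...j1gQ== backup@workstation

(The key material shown here is obviously just a shortened placeholder.)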

We’ll prepend that line with this.
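The from= and command= parts are the important ones; the various no-* options are extra restrictions that I consider good practice:

    from="workstation",command="rdiff-backup --server",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty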

The complete line will then look something like this.
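(Same placeholder key as above; on a real system the key will of course be much longer.)

    from="workstation",command="rdiff-backup --server",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA...j1gQ== backup@workstation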

Make sure that the line is not broken up into several lines, because then it won't work. Also remember to change from="workstation" to your real hostname. More information about this can be found in sshd's man page.

Note: Even though this makes the system more secure, you should keep the private key (/var/backups/.ssh/id_rsa_backup_server on workstation) secret. Anybody on workstation who gets hold of that key will be able to read (but not write) any file on server.

The last step in our ssh configuration is to edit /etc/ssh/sshd_config again and change PermitRootLogin yes to PermitRootLogin forced-commands-only. This will make sure that root login is only allowed with public key authentication and only if the command option is set. Don’t forget to reload the configuration.

Now, let us test the setup to see if it works.
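rdiff-backup has a built-in check for exactly this; run it as the backup user on workstation:

    rdiff-backup --test-server backup-server::/

It should report that the server connection is fine, without ever asking for a password.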

If you get a password prompt, then something is wrong. It’s probably a problem with the file /root/.ssh/authorized_keys on server. Go back and make sure you’ve done everything as described. Check the file /var/log/auth.log on server for any errors. If you’re still unable to get it to work, contact me and maybe we can find the problem.

Configuring rdiff-backup

If you are interested in backing up everything on server, the only command you need to know is
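something along these lines. I'm assuming /var/backups/rdiff-backups/server as the destination directory, and you'll almost certainly want to exclude virtual filesystems such as /proc and /sys:

    rdiff-backup --print-statistics --exclude /proc --exclude /sys \
        backup-server::/ /var/backups/rdiff-backups/server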

(Remove --print-statistics if you want it to be silent.)

Me, I'm only interested in backing up a few specific directories. To start with, I would like to back up my webpages. They're located in /var/www and /home/www (I would prefer to have everything in /home/www, but suEXEC makes that impossible). So I create the file /var/backups/rdiff-backups/backup-list-server with the following contents.
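    /var/www
    /home/www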

To start the backup, issue the command
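As the backup user on workstation, again assuming /var/backups/rdiff-backups/server as the destination; the trailing --exclude '**' makes sure nothing outside the filelist is picked up:

    rdiff-backup --print-statistics \
        --include-globbing-filelist /var/backups/rdiff-backups/backup-list-server \
        --exclude '**' backup-server::/ /var/backups/rdiff-backups/server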

If you have more “regular” directories or files you wish to back up, just append the path to the backup-list-server file and rerun the command. Every time you wish to perform a backup, you just run that command. Later on we'll automate this, but first, I would like to discuss a couple of “special” directories.

Subversion repositories

The first thing you should consider before backing up Subversion repositories is whether they are using the BDB or FSFS backend. If they are using the BDB backend, you're on your own, because then you can't take a backup directly. Instead you'll have to use “svnadmin hotcopy <repopath> <backuppath>” to create a consistent copy of your repository and then add <backuppath> to the filelist. I urge you to convert them to the FSFS backend instead.

Don’t know if you’re using BDB or FSFS? The following command will tell you.
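Here with a hypothetical repository in /home/svn/myrepo (repositories created with Subversion older than 1.1 don't have the fs-type file at all, which also means BDB):

    cat /home/svn/myrepo/db/fs-type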

If the output is “fsfs” you’re using FSFS, otherwise it’s BDB.

So, now that you are using FSFS it’s simply a question of adding these lines to the filelist (assuming your repositories are in /home/svn).
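In rdiff-backup's globbing filelists a line starting with “- ” is an exclude, and earlier lines take precedence over later ones. The pattern below assumes your repositories live directly under /home/svn:

    - /home/svn/*/db/transactions/*
    /home/svn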

The first line will make sure that no ongoing transactions are backed up.

Note: Even though you’re using FSFS, it’s still possible to get a backup that’s inconsistent. But, and this is important, you don’t get inconsistent data. The only file that might get inconsistent is db/current. Since this is a small text file (three numbers, including current revision) it’s possible to restore it by hand or with a script in the unlikely event it’s inconsistent.

If it were possible to specify an order in which files got backed up, it would be possible to make the repository completely consistent by making sure db/current was the first file that was backed up (this is how svnadmin hotcopy does it).

In my opinion it's not worth the strain on the server that an svnadmin hotcopy causes, since a new hotcopy has to be made before every backup. If you'd like more information, take a look at the end of this file.

MySQL databases

MySQL databases have much the same problem as Subversion repositories with the BDB backend: they need to be hotcopied to be in a consistent state. To cope with this I've created the script mysql-backup that you can download to your server. Edit the script and change the paths if you have to, try it out, and if it works, add this (assuming you saved the script as /root/bin/mysql-backup) to /etc/cron.d/backup on the server.
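A line along these lines will do; the time is just an example, pick anything that finishes before the nightly rdiff-backup run (note that files in /etc/cron.d need the extra user field):

    0 4 * * *   root    /root/bin/mysql-backup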

This will make a hotcopy of every database (except test) to the configured backup directory (default /var/backups/mysql). Then it’s just a question of adding /var/backups/mysql to the filelist and voila, consistent backups.

If the script fails it might be because you don’t have the root password saved in ~root/.my.cnf. Add the following to the file:
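Replace the password with your real MySQL root password, and keep the file readable by root only (chmod 600 ~root/.my.cnf):

    [client]
    user     = root
    password = your-mysql-root-password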

Maildirs

Backing up a live maildir (while both the SMTP server and the POP/IMAP server access it) should not be a problem, since all mails are stored in separate files and delivering new mail, deleting old mail and so on are supposed to be atomic operations. What can be a problem, however, are the index files used by the POP/IMAP server. But since those are recreated if they don't exist (at least if you use Dovecot), we'll just exclude them from the backup.

So, if your mails are located in /home/mail and you use Dovecot, the filelist will contain the following.
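Something like this; the exact index file names depend on your Dovecot version, and the patterns below are meant to cover both the old .imap.index* files and the newer dovecot.index* ones:

    - /home/mail/**/.imap.index*
    - /home/mail/**/dovecot.index*
    /home/mail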

If you read the document mail-storages.txt that comes with Dovecot you might object to excluding all imap indices. But, as can be read on the Dovecot wiki, the document is out of date.

Final filelist

The complete contents of the file /var/backups/rdiff-backups/backup-list-server follows.
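Putting together everything from above (with my assumed paths; remember that the exclude lines have to come before the include they refine):

    - /home/svn/*/db/transactions/*
    /home/svn
    - /home/mail/**/.imap.index*
    - /home/mail/**/dovecot.index*
    /home/mail
    /var/backups/mysql
    /var/www
    /home/www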

Automate the backup

Now it’s time to automate the backup. On workstation, create the file /etc/cron.d/rdiff-backup with the following contents.
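Something like this; the time is just an example, and the destination is the same assumed directory as before:

    30 5 * * *  backup  rdiff-backup
                        --include-globbing-filelist /var/backups/rdiff-backups/backup-list-server
                        --exclude '**'
                        backup-server::/ /var/backups/rdiff-backups/server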

If you decided to back up the whole disk, the file should instead contain this.
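For example (which directories you exclude is up to you; I'm just skipping the obvious virtual filesystems):

    30 5 * * *  backup  rdiff-backup
                        --exclude /proc --exclude /sys --exclude /tmp
                        backup-server::/ /var/backups/rdiff-backups/server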

Observe that the command should be on one line. I’ve added line breaks just to make it look better.

Verify the backup

You should of course verify the backup from time to time. A quick way to do it is like this.
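For example, as the backup user, restore the latest version of one of the backed-up directories into a scratch directory (here /var/www, assuming the same destination directory as in the rest of this article):

    rdiff-backup -r now /var/backups/rdiff-backups/server/var/www /var/backups/rdiff-backups/restore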

And then check that the contents of /var/backups/rdiff-backups/restore are what you expect.

Delete old backup

Once in a while, for example after you've backed up the backup directory to e.g. CD-R, you can delete some old backups to save space. To delete backups older than four weeks, issue this command.
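Assuming the same backup destination as before:

    rdiff-backup --remove-older-than 4W /var/backups/rdiff-backups/server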

Alternatively, you might choose to keep only the last 20 rdiff-backup sessions.
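That's done with the B suffix, which counts backup sessions instead of time:

    rdiff-backup --remove-older-than 20B /var/backups/rdiff-backups/server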

Restore from the backup

To be written. Until then, you can look at some examples.


4 thoughts on “Automated and painless backup with rdiff-backup”

  1. Thanks for this great tutorial!

    A comment about one bit I found confusing: in the text, you state that a local, non-root user (e.g. backup) does the backup on the workstation. The workstation is, of course, the host with the data to be backed up. I think you mean that it’s root that does the backup on the workstation, and ‘backup’ that runs on the server.

    Thanks

  2. Glad you liked it!

    I can see that it can be somewhat confusing, but in my case I’m actually doing a backup of my server to my workstation. The server is the host with the data to be backed up so root@server and backup@workstation is correct.
