Automated and painless backup with rdiff-backup
This article will show you how to set up and configure rdiff-backup to back up a set of specified directories on computer A to computer B in an automated, painless and secure way. The backup process will be initiated from computer B, which means that computer B doesn’t have to be online 24/7. It also means that it is relatively easy to make one computer a hub that backs up several servers.
In my setup I use my workstation to back up my server. So, for the sake of convenience, from here on the computer being backed up (computer A) will be called server and the computer where the backup is saved (computer B) will be called workstation (shortened ws). Both my server and workstation are running Debian GNU/Linux. The server is running the stable release (sarge) and the workstation unstable (sid). This means that if you’re using something else, you’ll have to figure out the installation part on your own. However, the rest should work equally well on any distribution.
Table of contents
- Installing rdiff-backup
- Configure ssh to work with rdiff-backup
- Configuring rdiff-backup
- Automate the backup
- Verify the backup
- Delete old backup
- More information
First we have to install rdiff-backup. On workstation, this is simple.
root@ws% apt-get install rdiff-backup
Since we want the same version of rdiff-backup on both computers, we will install rdiff-backup from backports.org on server. We add the following lines to
# Backports
deb http://www.backports.org/debian/ sarge-backports main
To make sure only rdiff-backup is taken from backports.org, we pin all packages to a priority below the standard priority and then explicitly pin rdiff-backup to a higher one. We do this by adding the following lines to
Package: *
Pin: release a=sarge-backports
Pin-Priority: 200

Package: rdiff-backup
Pin: release a=sarge-backports
Pin-Priority: 990
Then there are just the standard commands.
root@server% apt-get update
root@server% apt-get install rdiff-backup
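Since the point of the pinning is to end up with the same version on both machines, it can be worth comparing the two before going on. The following is only a sketch: it assumes that rdiff-backup --version prints a line of the form "rdiff-backup 1.0.4", and the function names are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers, not part of rdiff-backup.

# Print the version number from a line such as "rdiff-backup 1.0.4".
# Assumption: the version is the second whitespace-separated field.
version_of() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Succeed only if both version lines carry the same version number.
versions_match() {
    [ "$(version_of "$1")" = "$(version_of "$2")" ]
}

# Example use on workstation, with the line printed on server pasted in
# by hand as the second argument:
#   versions_match "$(rdiff-backup --version)" "rdiff-backup 1.0.4" \
#       || echo "version mismatch" >&2
```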
Configure ssh to work with rdiff-backup
To do the backup we’ll use a local, non-root user on workstation. I’ve decided to use the pre-existing user backup. It has the following entries in
passwd: backup:x:34:34:backup:/var/backups:/bin/sh
shadow: backup:*:13050:0:99999:7:::
group: backup:x:34:
If you don’t have that user, feel free to use another one. Or create a backup user with this command.
root@ws% adduser --system --home /var/backups --shell /bin/sh --group --disabled-login backup
In my case, the user backup didn’t have write permission to
/var/backups. As root, I therefore created the two directories that the user will need: .ssh to store the ssh config files and rdiff-backups to store, well, the rdiff backups.
root@ws% cd /var/backups
root@ws% mkdir --mode=700 .ssh rdiff-backups
root@ws% chown backup:backup .ssh rdiff-backups
Next we’ll create an ssh key so that backup@workstation can log in as root@server unattended (without entering a password) to perform the backup. If you think this seems a bit insecure, don’t worry. We’ll make it secure later on.
root@ws% su - backup
backup@ws% ssh-keygen -t rsa -N '' -f /var/backups/.ssh/id_rsa_backup_server
To be able to use the newly generated ssh key together with rdiff-backup, we’ll need to create an ssh alias. We will call the alias backup-server, and this is done by adding the following lines to
/var/backups/.ssh/config (make sure you edit the file as the backup user).
Host backup-server
  Hostname server
  User root
  Identityfile /var/backups/.ssh/id_rsa_backup_server
  Protocol 2
If you wish, you can add other options like
Cipher blowfish etc as well (man ssh_config for a list of options). Personally I’ve added
Port 222 since the ssh service on my server is running on port 222. This way I avoid all brute-force hacking attempts. It works a lot better than other, more elaborate attempts to block unwanted connections.
The next thing on our agenda is making sure root can log in on server. The file
/etc/ssh/sshd_config on server should contain the following lines (among others).
PermitRootLogin yes
PubkeyAuthentication yes
If not, add or change existing lines and reload the server configuration.
root@server% invoke-rc.d ssh reload
Then, back to workstation. Now we need to copy the public ssh key to server. We do this by using the ssh-copy-id command.
root@ws% ssh-copy-id -i /var/backups/.ssh/id_rsa_backup_server.pub root@server
Or, if you are like me and are running the ssh service on port 222, use this command.
root@ws% ssh-copy-id -i /var/backups/.ssh/id_rsa_backup_server.pub "-p 222 root@server"
Now backup@workstation should be able to log in as root@server without entering a password. Since this is the first time, you might have to verify and accept the host key.
root@ws% su - backup
backup@ws% ssh backup-server
root@server% id
uid=0(root) gid=0(root) groups=0(root)
If it didn’t work, go back and check that you’ve done everything as described. Please contact me if you are unable to figure out what’s wrong. Maybe I can be of assistance.
Before we proceed, it’s time to tighten the system a bit. Personally, I don’t like allowing unlimited root logins, especially not with an ssh key without a password. So, while logged in as root on server, edit the file
/root/.ssh/authorized_keys. It should contain a row with the exact same contents as the file
/var/backups/.ssh/id_rsa_backup_server.pub on workstation and look similar to this (I’ve shortened the key, yours will be longer)
ssh-rsa AAAA[...] backup@workstation
We’ll prepend that line with this.
command="rdiff-backup --server --restrict-read-only /",from="workstation",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty
The complete line will then look something like this.
command="rdiff-backup --server --restrict-read-only /",from="workstation",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAA[...] backup@workstation
Make sure that the line is not broken up into several lines, because then it won’t work. Also remember to change
from="workstation" to your real hostname. More information about this can be found in sshd’s man page.
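Because the complete line is long and has to stay on one line, it may be easier to generate it than to edit it by hand. The following is a minimal sketch; the function name is invented for this example, and the options are exactly the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: prefix a public key line with the restrictions
# described above and print the result as a single line.
# $1 = hostname allowed to connect, $2 = contents of the public key file.
restricted_key_line() {
    printf 'command="rdiff-backup --server --restrict-read-only /",from="%s",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty %s\n' "$1" "$2"
}

# Example, as root on server, with the public key file copied over:
#   restricted_key_line workstation "$(cat id_rsa_backup_server.pub)"
```

The output can then be pasted into /root/.ssh/authorized_keys in place of the unrestricted line.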
Note: Even though this makes the system more secure, you should keep the private key (
/var/backups/.ssh/id_rsa_backup_server on workstation) secret. Anybody on workstation that gets a hold of that key will be able to read (but not write) any file on server.
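One simple way to keep an eye on this is to check that only the owner can read the private key. The sketch below assumes GNU stat (as shipped with Debian's coreutils); the function name is made up for this example.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the file has mode 600 or 400,
# that is, no access at all for group and others.
# Assumption: GNU stat, which understands -c '%a'.
key_is_private() {
    mode=$(stat -c '%a' "$1") || return 1
    [ "$mode" = "600" ] || [ "$mode" = "400" ]
}

# Example, on workstation:
#   key_is_private /var/backups/.ssh/id_rsa_backup_server \
#       || echo "private key is readable by others" >&2
```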
The last step in our ssh configuration is to edit
/etc/ssh/sshd_config again and change
PermitRootLogin yes to
PermitRootLogin forced-commands-only. This will make sure that root login is only allowed with public key authentication and only if the command option is set. Don’t forget to reload the configuration.
root@server% invoke-rc.d ssh reload
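To double-check which value sshd will actually pick up, the setting can be read back from the file. This is a rough sketch with an invented function name; it relies on sshd using the first value it finds for a keyword and on keywords being case-insensitive.

```shell
#!/bin/sh
# Hypothetical helper: print the PermitRootLogin value from an
# sshd_config-style file. Commented-out lines do not match because
# their first field starts with "#". Only the first match is printed.
permit_root_login() {
    awk 'tolower($1) == "permitrootlogin" { print $2; exit }' "$1"
}

# Example, as root on server:
#   permit_root_login /etc/ssh/sshd_config
```

Running sshd -t as root before the reload additionally checks the configuration file for syntax errors.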
Now, let us test the setup to see if it works.
backup@ws% rdiff-backup --test-server backup-server::/
Testing server started by: ssh -C backup-server rdiff-backup --server
Server OK
If you get a password prompt, then something is wrong. It’s probably a problem with the file
/root/.ssh/authorized_keys on server. Go back and make sure you’ve done everything as described. Check the file
/var/log/auth.log on server for any errors. If you’re still unable to get it to work, contact me and maybe we can find the problem.
If you are interested in backing up everything on server, the only command you need to know is
backup@ws% rdiff-backup --print-statistics \
  backup-server::/ /var/backups/rdiff-backups/backup-server
(Remove --print-statistics if you want it to be silent.)
Me, I’m only interested in backing up a few specific directories. To start with, I would like to backup my webpages. They’re located in
/home/www (I would prefer to have everything in
/home/www, but suEXEC makes that impossible). So I create the file
/var/backups/rdiff-backups/backup-list-server with this contents
To start the backup, issue the command
backup@ws% cd /var/backups/rdiff-backups
backup@ws% rdiff-backup --print-statistics --include-globbing-filelist \
  backup-list-server --exclude / backup-server::/ backup-server
If you have more “regular” directories or files you wish to back up, just append the path to the
backup-list-server file and rerun the command. Every time you wish to perform a backup, you just run that command. Later on we’ll automate this, but first, I would like to discuss a couple of “special” directories.
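Appending a path can be scripted so that the same entry never ends up in the filelist twice. The following is just a sketch; the function name is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: append a path to a filelist unless an identical
# line is already present.
# $1 = filelist, $2 = path to include.
add_to_filelist() {
    touch "$1" || return 1
    # -x matches whole lines only, -F treats the path as a fixed string.
    grep -qxF -e "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example, as the backup user on workstation:
#   add_to_filelist /var/backups/rdiff-backups/backup-list-server /home/www
```

This only suits plain include lines. In a globbing filelist the earlier lines take precedence, so an exclude line (one starting with "- ") has to be placed by hand above the include line of its parent directory.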
The first thing you should consider before backing up Subversion repositories is whether they are using the BDB or the FSFS backend. If they are using the BDB backend, you’re on your own, because then you can’t take a backup directly. Instead you’ll have to use “svnadmin hotcopy <repopath> <backuppath>” to create a consistent copy of your repository and then add <backuppath> to the filelist. I urge you to convert them to the FSFS backend instead.
Don’t know if you’re using BDB or FSFS? The following command will tell you.
user@server% cat <repopath>/db/fs-type
If the output is “fsfs” you’re using FSFS, otherwise it’s BDB.
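With several repositories, the same check can be wrapped in a small function. This is a sketch that applies exactly the rule above; the function name is made up, and /home/svn in the usage comment is only an example location.

```shell
#!/bin/sh
# Hypothetical helper: print "fsfs" if <repopath>/db/fs-type says so,
# otherwise print "bdb" (this includes the case where the file is missing).
# $1 = path to the repository.
svn_backend() {
    if [ "$(cat "$1/db/fs-type" 2>/dev/null)" = "fsfs" ]; then
        echo fsfs
    else
        echo bdb
    fi
}

# Example: list every repository under /home/svn with its backend.
#   for repo in /home/svn/*; do
#       printf '%s %s\n' "$(svn_backend "$repo")" "$repo"
#   done
```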
So, now that you are using FSFS it’s simply a question of adding these lines to the filelist (assuming your repositories are in
- /home/svn/*/db/transactions
/home/svn
The first line will make sure that no ongoing transactions are backed up.
Note: Even though you’re using FSFS, it’s still possible to get a backup that’s inconsistent. But, and this is important, you don’t get inconsistent data. The only file that might get inconsistent is db/current. Since this is a small text file (three numbers, including current revision) it’s possible to restore it by hand or with a script in the unlikely event it’s inconsistent.
If it were possible to specify the order in which files are backed up, it would be possible to make the repository completely consistent by making sure db/current was the first file that was backed up (this is how svnadmin hotcopy does it).
In my opinion it’s not worth the strain that an svnadmin hotcopy puts on the server, since a new hotcopy has to be made before each backup can take place. If you’d like more information, take a look at the end of this file.
MySQL databases have much the same problems as Subversion repositories with the BDB backend: they need to be hotcopied to be in a consistent state. To cope with this I’ve created the script mysql-backup that you can download to your server. Edit the script and change the path if you have to, try it out, and if it works then add this (assuming you saved the script as
/etc/cron.d/backup (on the server)
# 04:50 every Monday
50 4 * * 1 root /root/bin/mysql-backup
This will make a hotcopy of every database (except test) to the configured backup directory (default
/var/backups/mysql). Then it’s just a question of adding
/var/backups/mysql to the filelist and voila, consistent backups.
If the script fails it might be because you don’t have the root password saved in
~root/.my.cnf. Add the following to the file:
Backing up a live maildir (while both the SMTP server and the POP/IMAP server access it) should not be a problem, since all mails are stored in separate files and operations such as delivering new mail and deleting old mail are supposed to be atomic. What can be a problem, however, are the index files used by the POP/IMAP server. But since those are recreated if they don’t exist (at least if you use Dovecot), we’ll just exclude them from the backup.
So, if your mails are located in
/home/mail and you use Dovecot, the filelist will contain.
- /home/mail/**/.imap.index*
/home/mail
If you read the document
mail-storages.txt that comes with Dovecot you might object to excluding all imap indices. But, as can be read on the Dovecot wiki, the document is out of date.
The complete contents of the file
/var/www
/home/www
- /home/svn/*/db/transactions
/home/svn
/var/backups/mysql
- /home/mail/**/.imap.index*
/home/mail
Automate the backup
Now it’s time to automate the backup. On workstation, create the file
/etc/cron.d/rdiff-backup with the following contents.
MAILTO="root"
# Backup server every Tuesday and Friday at 12:15
15 12 * * tue,fri backup rdiff-backup --print-statistics
  --include-globbing-filelist /var/backups/rdiff-backups/backup-list-server
  --exclude / backup-server::/ /var/backups/rdiff-backups/backup-server
If you decided to back up the whole disk, the file rdiff-backup should instead contain this.
MAILTO="root"
# Backup server every Tuesday and Friday at 12:15
15 12 * * tue,fri backup rdiff-backup --print-statistics backup-server::/ /var/backups/rdiff-backups/backup-server
Observe that the command should be on one line. I’ve added line breaks just to make it look better.
Verify the backup
You should of course verify the backup from time to time. A quick way to do it is to do like this.
backup@ws% cd /var/backups/rdiff-backups
backup@ws% mkdir restore
backup@ws% rdiff-backup --restore-as-of now backup-server restore
And then check that the contents in
/var/backups/rdiff-backups/restore is what’s expected.
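Part of that check can be automated. The sketch below only verifies that every plain path in the filelist exists in the restore directory; it does not compare file contents. The function name is invented, and include lines containing wildcards are not handled.

```shell
#!/bin/sh
# Hypothetical spot check: for every include line in the filelist (lines
# starting with "/"), report paths that are missing from the restore
# directory. Exclude lines ("- ...") are skipped.
# $1 = filelist, $2 = restore directory.
# Returns 0 if nothing is missing, 1 otherwise.
missing_in_restore() {
    status=0
    while IFS= read -r path; do
        case $path in
            /*) [ -e "$2$path" ] || { echo "missing: $path"; status=1; } ;;
        esac
    done < "$1"
    return $status
}

# Example, as the backup user on workstation:
#   cd /var/backups/rdiff-backups
#   missing_in_restore backup-list-server restore
```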
Delete old backup
Once in a while, for example after you’ve backed up the backup directory to e.g. CD-R, you can delete some old backups to save space. To delete backups older than four weeks, issue this command.
backup@ws% rdiff-backup --remove-older-than 4W /var/backups/rdiff-backups/backup-server
Alternatively, you might choose to keep the last 20 rdiff-backup sessions.
backup@ws% rdiff-backup --remove-older-than 20B /var/backups/rdiff-backups/backup-server
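As far as I know, rdiff-backup refuses to remove more than one increment in a single run unless --force is also given, so when a lot has piled up the commands above may need that option. The sketch below prints the command instead of running it, so that it can be reviewed first. The function name is made up, and the check only accepts the simple forms used here (a number followed by s, m, h, D, W, M, Y or B), although rdiff-backup understands more time formats than that.

```shell
#!/bin/sh
# Hypothetical helper: validate a simple age or session count and print
# the matching pruning command without executing it.
# $1 = age or session count (for example 4W or 20B), $2 = backup directory.
prune_command() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+[smhDWMYB]$' || return 1
    printf 'rdiff-backup --force --remove-older-than %s %s\n' "$1" "$2"
}

# Example, as the backup user on workstation:
#   rdiff-backup --list-increments /var/backups/rdiff-backups/backup-server
#   prune_command 4W /var/backups/rdiff-backups/backup-server
```

Listing the increments first with --list-increments shows what is there before anything is removed.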
Restore from the backup
To be written. Until then, you can look at some examples.