
Tuesday, 9 July 2013

Automated encrypted backups with Duplicity

Backups are about the most important part of managing a server. If anything happens to your server, like data loss, a customer making a mistake (and accidentally removing data) or a full-blown server crash, you’ll be happy to have backups. There are plenty of people who don’t make them, though. For some it may feel too complicated, for some it may feel like too much work, and others may just not care.

I’m going to show you how to easily make (and restore) encrypted incremental backups with Duplicity. Duplicity is a tool that uses tar, librsync and GnuPG to back up data. All your data is encrypted, so nobody else can look at it (unless they have your GnuPG private key and passphrase). While Duplicity is _officially_ still in beta, it’s been considered stable enough by many (including me) for the past couple of years. In addition to that, it powers the default backup tool in Ubuntu (Deja Dup, which is built on Duplicity). Duplicity supports the following protocols for connecting to a remote filesystem: ssh/scp, local file access, rsync, ftp, HSI, WebDAV, Tahoe-LAFS, and Amazon S3.
In this guide, I’m going to assume we’re backing up to a backup server using SFTP. All examples should work on both Ubuntu (or other Debian-based distributions) and CentOS (or RHEL/Fedora).
# Installing duplicity
First things first. Let’s install duplicity. It’s simple enough. On Ubuntu:
sudo apt-get install duplicity
On CentOS (you need EPEL for this):
yum install duplicity python-paramiko
Done.
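To confirm the installation worked, you can check that the tools this guide relies on are on your PATH. This is only a minimal sketch; the helper name have_cmd is my own invention and not part of duplicity or GnuPG.

```shell
# have_cmd is a hypothetical helper: it succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report on the two tools used throughout this guide.
for tool in duplicity gpg; do
    if have_cmd "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```

If both are found, running duplicity --version shows which release you ended up with.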

Creating a GnuPG key

Duplicity both encrypts and signs your backups with your GnuPG key. Your backups are encrypted for privacy and security, and signed so that you can detect whether an archive has been altered. (Incremental backups rely on separate librsync signature files, not on the GnuPG signature.) Creating a GnuPG key is pretty simple. You do need GnuPG installed, though. On Ubuntu the package is named ‘gnupg’ (sudo apt-get install gnupg) and on CentOS ‘gnupg2’ (yum install gnupg2). In my experience GnuPG is almost always present, even on minimal installations.
Let’s get started and create our key. Run this as the user that’s making your backups (usually root):
gpg --gen-key
This will ask you several questions. It starts with the key type and size:
What kind of key you want: ‘RSA and RSA’
What keysize you want: 4096
The default key type is fine (RSA with an RSA subkey). For key length, I always go for 4096 bits these days, as it’s the strongest of the sizes offered.
How long the key must be valid: 0 (infinity)
If you really want a non-expiring key: Y
Usually, I would not recommend a non-expiring key. However, if you want your backups to last, you’re better off with this option. An expired GnuPG key cannot be used for encryption, just for decryption. So from the moment of expiration, all new backups would be encrypted and signed with a different key. This is really undesirable, because it will break the incremental chain and thus, history. The alternative would be re-encrypting and re-signing your backups every x years (for example) because your key has expired. If you have terabytes of backups, this is not something you want to do.
Let’s go on with identification for the key. I usually give the key the name “Duplicity Backup”. That way you can always identify the proper key, but feel free to give it another name. The e-mail address is completely up to you as well, as is the comment. The name, e-mail address and comment won’t be used for anything other than identifying your key in a list of keys. Confirm the information at the end.
Real name: “Duplicity Backup”
Email address: operations@example.net
Comment: (none)
Confirmation: O for Okay
Finally, it will ask you for a passphrase (twice). Pick a very secure one here, as (combined with your GnuPG private key) it is the gateway to your data:
Passphrase: <your very secure passphrase>
Repeat passphrase: <your very secure passphrase>
After that, GnuPG needs to generate random bytes and it needs entropy for that. On my desktop this was not an issue, but on an idling server it proved challenging. Just run a few dd tests or do some other heavy work in a different terminal and you should be fine. When done, GnuPG gives you information about the key you’ve just created:
gpg: /root/.gnupg/trustdb.gpg: trustdb created
gpg: key 1731EA9E marked as ultimately trusted
public and secret key created and signed.
gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0 valid: 1 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 1u
pub 4096R/1731EA9E 2013-05-21
Key fingerprint = 867B 4675 2693 CA7C 34D9 E394 FDF1 0274 1731 EA9E
uid Duplicity Backup <operations@example.net>
sub 4096R/79FA3B1E 2013-05-21
The key ID listed before ‘marked as ultimately trusted’ on the second line is very important. That’s the public key ID and we will use it to tell Duplicity which key to use for encryption and signing. In this case it’s: 1731EA9E.
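If you ever only have the fingerprint at hand, note that the short key ID is simply its last eight hex digits, as you can see by comparing the two in the output above. A small sketch of that relationship; short_key_id is a made-up helper name, not a gpg feature.

```shell
# short_key_id is a hypothetical helper: it prints the last 8 hex digits of a
# fingerprint, which is the short key ID gpg displays.
short_key_id() {
    # Remove the spaces gpg prints between groups, then keep the last 8 characters.
    printf '%s' "$1" | tr -d ' ' | tail -c 8
}

# Fingerprint copied from the gpg output above.
short_key_id "867B 4675 2693 CA7C 34D9 E394 FDF1 0274 1731 EA9E"
echo
```

You can look the ID up again at any later time with gpg --list-keys.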
# Creating backups
Now that we’ve got our key ready, we can start making backups! What we’re going to do is back up a folder of files to a remote filesystem, using SFTP, and then list the files to see what has happened. As an example, I’m going to back up my mission-critical wallpaper collection.
From the directory which contains the folder that you are going to back up, run:
duplicity --encrypt-key=1731EA9E --sign-key=1731EA9E mission-critical-wallpapers sftp://backupserver.example.net/mission-critical-wallpapers
What we’ve done here is tell duplicity to use our recently created key for encrypting (--encrypt-key) and signing (--sign-key) the backups. In addition to that, we’ve told it which folder on the local filesystem to back up (mission-critical-wallpapers) and where to put it on the remote filesystem (sftp://backupserver.example.net/mission-critical-wallpapers). The path is relative to your home directory on the remote filesystem. When executed, Duplicity asks for your GnuPG passphrase twice (once for encryption and once for signing):
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: none
GnuPG passphrase:
GnuPG passphrase for signing key:
No signatures found, switching to full backup.
--------------[ Backup Statistics ]--------------
StartTime 1369202758.42 (Wed May 22 08:05:58 2013)
EndTime 1369202758.82 (Wed May 22 08:05:58 2013)
ElapsedTime 0.40 (0.40 seconds)
SourceFiles 10
SourceFileSize 6895171 (6.58 MB)
NewFiles 10
NewFileSize 6895171 (6.58 MB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 10
RawDeltaSize 6891075 (6.57 MB)
TotalDestinationSizeChange 6809236 (6.49 MB)
Errors 0
-------------------------------------------------
Lucky for me, my collection of mission-critical wallpapers is not that large. However, if you have a lot to back up, now is the time to grab some coffee (and do something else while you’re at it). The initial backup can take quite a while, but the incremental backups afterwards will be a lot faster. Duplicity doesn’t display any progress, so after you type your passphrase it just runs. To see some progress, go to the remote filesystem and run ‘watch du -sh’ on the target folder. This will show you how large the target folder is.
Incremental backups are pretty smart. Duplicity only backs up what has changed and keeps track of it. So you will always be able to restore a full and complete copy of your data, exactly the way it was the last time you backed it up (or at another moment in time). So if you add one wallpaper to that folder, it will back up just that one wallpaper. When restoring the backup completely, it will include that one wallpaper as well. You can also choose to restore an older version of your backup, from before that wallpaper was added. Let’s add a few wallpapers to the folder and run duplicity again:
duplicity --encrypt-key=1731EA9E --sign-key=1731EA9E mission-critical-wallpapers sftp://user@backupserver.example.net/mission-critical-wallpapers
The output is now different:
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Wed May 22 08:05:50 2013
GnuPG passphrase:
GnuPG passphrase for signing key:
--------------[ Backup Statistics ]--------------
StartTime 1369202811.55 (Wed May 22 08:06:51 2013)
EndTime 1369202811.99 (Wed May 22 08:06:51 2013)
ElapsedTime 0.44 (0.44 seconds)
SourceFiles 18
SourceFileSize 14414631 (13.7 MB)
NewFiles 9
NewFileSize 7523556 (7.18 MB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 9
RawDeltaSize 7519460 (7.17 MB)
TotalDestinationSizeChange 7456505 (7.11 MB)
Errors 0
-------------------------------------------------
Compared to the statistics of the initial run, you now see an increased SourceFileSize. NewFiles and NewFileSize give an overview of what has changed. These statistics can be quite useful for checking whether your backup has succeeded.
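If you save that output somewhere, you can check the Errors line without reading the whole block. The snippet below is a rough sketch that assumes the statistics keep the "Name value" layout shown above; backup_errors is a name I made up for the example.

```shell
# backup_errors is a hypothetical helper: it reads a statistics block on stdin
# and prints the number found on the "Errors" line.
backup_errors() {
    awk '$1 == "Errors" { print $2 }'
}

# Sample input taken from the second run above (shortened).
sample='NewFiles 9
DeletedFiles 0
Errors 0'

errors=$(printf '%s\n' "$sample" | backup_errors)
if [ "$errors" = "0" ]; then
    echo "backup reported no errors"
else
    echo "backup reported errors: $errors"
fi
```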
Congratulations! You’ve just made your first backup with Duplicity! Now let’s restore it…

Restoring backups

Now that our files are backed up and safe, we are going to restore the backup to see whether it has actually worked. Before we restore, let’s have a look at what is in the backup:
duplicity list-current-files sftp://user@backupserver.example.net/mission-critical-wallpapers
This produces a list of the files in the backup (the most recent version). It does this based solely on the signature files, so it doesn’t have to download or access the complete backup. You’ll get a list that looks like:
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Wed May 22 08:05:50 2013
Wed May 22 08:06:40 2013 .
Wed May 22 08:04:13 2013 02140_romanbath_2560x1600.jpg
Wed May 22 08:04:14 2013 02141_auroraborealis_2560x1600.jpg
Wed May 22 08:04:13 2013 02142_lakechapalajaliscomexico_2560x1600.jpg
Wed May 22 08:04:13 2013 02143_sonicboom_2560x1600.jpg
Wed May 22 08:04:13 2013 02144_islate_2560x1600.jpg
Wed May 22 08:04:13 2013 02145_powerfulsunrise_2560x1600.jpg
Wed May 22 08:04:13 2013 02146_newyork_2560x1600.jpg
Wed May 22 08:04:12 2013 02147_thefence_2560x1600.jpg
Wed May 22 08:04:14 2013 02148_sunsetintuscanysaturdaymay23rd2009_2560x1600.jpg
Wed May 22 08:06:40 2013 02330_morainelakepanorama_2560x1600.jpg
Wed May 22 08:06:40 2013 02333_firstlight_2560x1600.jpg
Wed May 22 08:06:40 2013 02334_portomoniz89s_2560x1600.jpg
Wed May 22 08:06:40 2013 02335_alpsteinbeforerain_2560x1600.jpg
Wed May 22 08:06:40 2013 02336_leavesatlynncanyonpark_2560x1600.jpg
Wed May 22 08:06:40 2013 02337_seasonofillusions_1920x1200.jpg
Wed May 22 08:06:40 2013 02338_yellowstonesunset_2560x1600.jpg
Wed May 22 08:06:40 2013 02339_thetimeofsilence_2560x1600.jpg
Now, that looks good! Let’s restore these backups:
duplicity restore sftp://user@backupserver.example.net/mission-critical-wallpapers restored-mission-critical-wallpapers
Duplicity will now restore your backups to the local path you’ve given. Once completed, it will give you a summary of the run:
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: Wed May 22 08:05:50 2013
GnuPG passphrase:
And you’re done!
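Two things you may want to try at this point are restoring an older version and checking that a restore really matches the original. The sketch below wraps both in small functions. The function names and the 3D age in the example are my own; --time is the duplicity restore option for picking a point in time, and diff -r only compares file contents, not permissions or timestamps.

```shell
# restore_as_of TIME URL TARGET (hypothetical helper): restore the backup at
# URL into TARGET as it was at TIME, for example 3D for "three days ago".
restore_as_of() {
    duplicity restore --time "$1" "$2" "$3"
}

# same_tree DIR_A DIR_B (hypothetical helper): succeed only if both directory
# trees contain the same files with the same content.
same_tree() {
    diff -r "$1" "$2" >/dev/null
}

# Example (not run here):
#   restore_as_of 3D sftp://user@backupserver.example.net/mission-critical-wallpapers old-wallpapers
#   same_tree mission-critical-wallpapers restored-mission-critical-wallpapers && echo "restore matches"
```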

Final note

I’ve now shown you how to do a basic backup and restore with Duplicity. However, it has a lot more options than I’ve just shown. In addition to that, you may want to automate your backups. I’ll go into detail about that in the next article, as it’s a subject on its own. For now, happy back-upping.
----------------------------------------------------------------

Automated encrypted backups with Duplicity


Last week I shared a tutorial on how to use duplicity to make encrypted, incremental backups over various available protocols. The process was a manual one, though, and would always require you to type the actual command and wait for the run to finish. It also had you manually keep track of what you did and didn’t back up. It was basically a nice and easy way to back up several folders, more or less archiving them in case you ever need them again.
This week I’m going to share a script with you that I’ve used for quite some time. It is based upon a script I found over at the Linode forums, which was written to back up to Amazon S3. All I’ve done is strip some of the unnecessary stuff intended for S3 and change some of the includes/excludes. Credit for the script thus goes to the original author. I’m going to explain the script to you, though, and let you know how you can tweak it to suit your needs.

There are several things you need for the script to work:
  1. A GnuPG key (see my previous tutorial on how to generate one)
  2. The password of your GnuPG key (you should know this)
  3. The public key ID of your GnuPG key (see my previous tutorial)
  4. A backup server accessible over SSH/SFTP
  5. Duplicity installed (see my previous tutorial)
I would really recommend using a dedicated GnuPG key for this script, meaning you don’t use that key for anything else. Once you’ve gathered all that, let’s get started with the script!

The Script

This is the full script. It may look long or complicated right now, but I’m going to go through it line by line. The script will work on various Linux distributions. I’ve tested it on Ubuntu and CentOS.
#!/bin/bash
trace () {
    stamp=`date +%Y-%m-%d_%H:%M:%S`
    echo "$stamp: $*" >> /var/log/backup.log
}
# Export your GnuPG passphrase to an ENV variable so you don't have to type it every time
export PASSPHRASE=<GnuPG passphrase>
# Identifier for your GnuPG key
GPG_KEY=<GnuPG key identifier>
# Backups older than this will be removed
OLDER_THAN="6M"
# The source of your backup, often a local directory
SOURCE=/
# The destination (relative to the home directory of the user you're logging in as)
DEST="sftp://<path>"
# Check if a full backup is necessary
FULL=
if [ $(date +%d) -eq 1 ]; then
    FULL=full
fi
trace "Backup for local filesystem started"
trace "... removing old backups"
# Comment this line (and the one above to keep the log clean) to disable backup removal
duplicity remove-older-than ${OLDER_THAN} ${DEST} >> /var/log/backup.log 2>&1
trace "... backing up filesystem"
# Full backup run
duplicity \
    ${FULL} \
    --encrypt-key=${GPG_KEY} \
    --sign-key=${GPG_KEY} \
    --include=/etc \
    --include=/home \
    --include=/root \
    --exclude=/var/tmp \
    --include=/var \
    --exclude=/** \
    ${SOURCE} ${DEST} >> /var/log/backup.log 2>&1
# And we're done
trace "Backup for local filesystem complete"
trace "------------------------------------"
# Reset the ENV variable
export PASSPHRASE=
It’s a bash script, meaning there are few requirements beyond a reasonably recent shell. Unless you’re running a distribution release that’s more than five years old, you should definitely be fine. The first line indicates that it’s a bash script:
#!/bin/bash
The next line creates a function called ‘trace’. This function enables easy logging to a log file. It prepends every log line with a timestamp. The log file is in /var/log and it’s called ‘backup.log’. This function is used several times in the script, as you will notice.
trace () {
    stamp=`date +%Y-%m-%d_%H:%M:%S`
    echo "$stamp: $*" >> /var/log/backup.log
}
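If you want to try the logging function without touching /var/log, a variant with a configurable log path is handy. This is only a sketch: the LOG_FILE variable and the /tmp default are assumptions of mine, while the script in this article hard-codes /var/log/backup.log.

```shell
# LOG_FILE is an assumption made for this sketch; the article's script always
# writes to /var/log/backup.log.
LOG_FILE=${LOG_FILE:-/tmp/backup-demo.log}

# Append one timestamped line to the log.
trace() {
    stamp=$(date +%Y-%m-%d_%H:%M:%S)
    echo "$stamp: $*" >> "$LOG_FILE"
}

trace "Backup for local filesystem started"
# Show the line that was just written.
tail -n 1 "$LOG_FILE"
```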
The next line exports your GnuPG passphrase as an environment variable, meaning it’s available to the script and to every program the script starts (such as duplicity). We do this to make sure duplicity doesn’t ask for a passphrase when it runs, enabling you to run this with cron (I’ll show you that in a bit).
# Export your GnuPG passphrase to an ENV variable so you don’t have to type it every time
export PASSPHRASE=<GnuPG passphrase>
Not to worry, we’ll reset the environment variable when the script is done, so it’ll be out of the environment.
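To make the scope of an exported variable concrete: it is inherited by processes started from the shell that exported it, not by unrelated processes of the same user. The demo below uses a dummy value. It also uses unset, which removes the variable completely instead of leaving an empty one behind; that is a small variation of mine on the script’s export PASSPHRASE= line.

```shell
# Dummy value for demonstration only; never put a real passphrase in a demo.
export PASSPHRASE="not-a-real-passphrase"

# A child process started from this shell inherits the variable.
sh -c 'echo "child sees: ${PASSPHRASE:-nothing}"'

# unset removes the variable, so later child processes no longer see it.
unset PASSPHRASE
sh -c 'echo "child sees: ${PASSPHRASE:-nothing}"'
```

Keep in mind that the passphrase is still stored in plain text in the script file itself.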
Next, we’ll assign your GnuPG key ID to a variable (this is the key duplicity will use for signing and encryption):
# Identifier for your GnuPG key
GPG_KEY=<GnuPG key identifier>
We set the threshold for backup removal (I’ll get back to this a couple of lines down) to 6 months. Feel free to change it to suit your needs:
# Backups older than this will be removed
OLDER_THAN="6M"
Set the source of the backup, in this case the root directory of your server:
# The source of your backup, often a local directory
SOURCE=/
And the destination, which is a path relative to the home directory of the user you’re logging in as on the remote server:
# The destination (relative to the home directory of the user you’re logging in as)
DEST="sftp://backups.example.net/server1backups"
And with the final variable we’ll determine whether we need a full backup or not:
# Check if a full backup is necessary
FULL=
if [ $(date +%d) -eq 1 ]; then
    FULL=full
fi
What the above does is set the FULL variable to ‘full’ when it’s the first day of the month. This variable is used when creating the actual backup. If it’s set to ‘full’, duplicity makes a complete backup. Otherwise, it does an incremental backup. Leaving it like this means you get a full backup every first day of the month and incremental backups to that one on all the other days. You could also leave out the three lines that set FULL to ‘full’; duplicity then makes a full backup only the first time and incremental backups from that point on.
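The same decision can be written as a small function, which makes it easy to try with different days. This is a sketch with a made-up name (backup_type); it strips a leading zero from values like "01" before comparing, so the comparison always works on a plain decimal number.

```shell
# backup_type DAY (hypothetical helper): print "full" when DAY is the first
# day of the month, and print nothing otherwise.
backup_type() {
    # date +%d prints two digits ("01".."31"); drop one leading zero.
    day=${1#0}
    if [ "$day" -eq 1 ]; then
        echo full
    fi
}

FULL=$(backup_type "$(date +%d)")
echo "FULL is '${FULL}'"
```

As far as I know, duplicity also has a --full-if-older-than option (for example --full-if-older-than 1M) that aims at a similar result without a date check; I have not used it in this script.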
After having set all variables, the script starts by adding information to the log file:
trace "Backup for local filesystem started"
trace "... removing old backups"
This calls the trace function with the quoted text as an argument. That text is then appended to the log.
Next, it’s removal time:
# Comment this line (and the one above to keep the log clean) to disable backup removal
duplicity remove-older-than ${OLDER_THAN} ${DEST} >> /var/log/backup.log 2>&1
Remember above, when we set the OLDER_THAN variable? This is where it is used. This command uses standard duplicity functionality to remove backups older than the given age, which can be expressed in months, weeks or even years. It does so at the destination you pass to it. If you do not want old backups to be removed, comment this line like this:
# Comment this line (and the one above to keep the log clean) to disable backup removal
# duplicity remove-older-than ${OLDER_THAN} ${DEST} >> /var/log/backup.log 2>&1
And it will skip the removal of backups.
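One thing worth checking on your own version: according to the duplicity man page, remove-older-than only lists the old backup sets and does not delete them unless --force is given. If your log shows old sets being listed but never removed, a variant like the following sketch may be what you need. The prune_old name is mine, and whether you need --force is something to verify against your installed release.

```shell
# prune_old AGE URL (hypothetical helper): delete backup sets older than AGE
# at URL. --force asks duplicity to delete instead of only listing.
prune_old() {
    duplicity remove-older-than "$1" --force "$2"
}

# Example (not run here):
#   prune_old 6M sftp://backups.example.net/server1backups
```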
Now it’s time for the actual backup process:
trace "... backing up filesystem"
# Full backup run
duplicity \
    ${FULL} \
    --encrypt-key=${GPG_KEY} \
    --sign-key=${GPG_KEY} \
    --include=/etc \
    --include=/home \
    --include=/root \
    --exclude=/var/tmp \
    --include=/var \
    --exclude=/** \
    ${SOURCE} ${DEST} >> /var/log/backup.log 2>&1
It starts with a trace to indicate the actual backup has started. Next, it’s on to the duplicity command. This is basically a very standard duplicity command but with a few additional parameters to include and exclude certain paths. The following line adds the value of the FULL variable to the command, which can be empty or ‘full’. In the case of ‘full’, duplicity does a full backup. Otherwise, it does an incremental one.
${FULL} \
The encrypt-key and sign-key parameters use the GPG_KEY you’ve set above. These are used to encrypt and sign the data.
The following couple of lines include and exclude several directories. The backslash (\) at the end is used to indicate the command continues on a new line.
--include=/etc \
--include=/home \
--include=/root \
--exclude=/var/tmp \
--include=/var \
--exclude=/** \
What we do here is include the etc, home and root directories. We then exclude /var/tmp but include the rest of /var. With --exclude=/** we exclude everything else.
A couple of things: if you want to exclude a subdirectory of a directory you are including, you have to do it before including the directory the subdirectory is in. Otherwise it will back it up anyway. The same logic is applied the other way around: if you want to include a subdirectory of a directory you are excluding, you have to do it before excluding the directory the subdirectory is in. You also have to include everything you do want to back up before you exclude everything else (--exclude=/**). A different approach could be that you exclude a couple of directories like /tmp and /var/tmp (or even /var/cache) and then just include everything else (--include=/**). That would look like this:
--include=/tmp/secretfiles \
--exclude=/tmp \
--exclude=/var/cache \
--exclude=/var/tmp \
--include=/** \
The above includes everything except /tmp (it does include /tmp/secretfiles), /var/cache and /var/tmp.
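The ordering rule boils down to "the first rule that matches a path wins". The toy model below imitates that in plain shell so you can see why the order matters. It is not how duplicity is implemented and it ignores duplicity’s finer points; the first_match function and the +/- rule notation are inventions for this sketch.

```shell
# first_match PATH RULE... (toy model, not duplicity itself)
# Each RULE is "+/some/path" (include) or "-/some/path" (exclude). A rule
# matches the path itself and everything below it; "/**" matches everything.
first_match() {
    path=$1
    shift
    for rule in "$@"; do
        pattern=${rule#?}    # the rule without its leading + or -
        case $path in
            $pattern|$pattern/*)
                case $rule in
                    +*) echo include ;;
                    -*) echo exclude ;;
                esac
                return 0
                ;;
        esac
    done
    # No rule matched: the path is included.
    echo include
}

# Evaluate a few example paths against the rule list shown above.
for p in /tmp/secretfiles/key.txt /tmp/scratch /var/cache/apt /etc/passwd; do
    printf '%s: ' "$p"
    first_match "$p" "+/tmp/secretfiles" "-/tmp" "-/var/cache" "-/var/tmp" "+/**"
done
```

Swapping the first two rules makes /tmp/secretfiles/key.txt come out as excluded, because the /tmp exclude is then seen first.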
Feel free to tweak the includes and excludes to suit your needs and be sure to check the result of the backup before you go to sleep ;-) The log file at /var/log/backup.log can help you with that.
The backup command ends with:
${SOURCE} ${DEST} >> /var/log/backup.log 2>&1
This uses the SOURCE and DEST variables defined before as the source and destination for the duplicity command. It then appends everything the command prints (both regular output and errors) to the log file.
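The script as written does not look at whether duplicity succeeded. If you want a failure to show up in the log, a wrapper like the one below is one way to do it. This is a sketch: run_logged and LOG_FILE are names I chose, and the /tmp default is only there so you can try it without root.

```shell
# LOG_FILE is an assumption for this sketch; the article's script uses
# /var/log/backup.log.
LOG_FILE=${LOG_FILE:-/tmp/backup-demo.log}

# run_logged CMD [ARGS...] (hypothetical helper): run a command, append its
# stdout and stderr to the log, note a failure, and return the exit status.
run_logged() {
    "$@" >> "$LOG_FILE" 2>&1
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with exit status $status: $*" >> "$LOG_FILE"
    fi
    return "$status"
}

# Example (not run here):
#   run_logged duplicity --encrypt-key="${GPG_KEY}" --sign-key="${GPG_KEY}" "${SOURCE}" "${DEST}" || exit 1
```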
Finally, we end the backup script by writing something to the log and resetting the ENV variable with your GnuPG passphrase:
# And we’re done
trace "Backup for local filesystem complete"
trace "------------------------------------"
# Reset the ENV variable
export PASSPHRASE=
After this, your GnuPG passphrase is no longer set in the environment (it is still stored in the script file itself, so keep that file protected).
So, that was the script! Not all that hard if you think about it. But let’s get it to run automatically!

Getting it up and running

Now that we’ve got the script all finished, let’s add it to cron. Cron is a time-based job scheduler. You can let it run whatever command you want at whatever time (or time interval) you want. We’re going to make this backup run every night at 01.00 (1 AM).
First, make sure the file is on the server. I always prefer to put these in /opt, but in this case /root is fine as well. Since the script contains the password for the GnuPG key, it’s better to not have it visible to other users. /root is protected from outsiders by default, so I’m going to use that in my example. Upload the script to /root and name it ‘duplicity-backup’ (without an extension).
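Because cron will start the script by its path, the file needs to be executable, and since it holds the passphrase it should be readable by its owner only. A minimal sketch of that step; lock_down is just a name for the example.

```shell
# lock_down FILE (hypothetical helper): make FILE readable, writable and
# executable by its owner only.
lock_down() {
    chmod 700 "$1"
}

# Example (not run here, needs root):
#   chown root:root /root/duplicity-backup
#   lock_down /root/duplicity-backup
```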
Then, on either CentOS or Ubuntu, run (only use sudo if you’re not logged in as root):
(sudo) crontab -e
This will open up a text editor with the crontab for root. Crontab contains the lines of the jobs cron has to execute. There are other ways to add cron jobs, but for now this is easiest. Add the following line at the end of the file:
0 1 * * *  /root/duplicity-backup
The first five fields indicate when the job should run. The first is the minute (0), the second the hour (1), the third the day of the month (* for any), the fourth the month (* for any) and the fifth the day of the week (* for any). The line then ends with the command to run. This job will thus run every day of every month at 01.00 (1 AM). Now save the file and you’re all set!
From now on, your backups should run automatically. It would be good to check them for the first couple of days to see if everything runs well before relying on them.
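To see how cron reads such a line, here is a small sketch that splits it into its five time fields (minute, hour, day of month, month, day of week) and the command. The describe_cron function is made up for illustration; cron itself does the real parsing.

```shell
# describe_cron LINE (hypothetical helper): label the fields of a crontab line.
describe_cron() {
    # read splits the line on whitespace; the last variable gets the remainder.
    read -r minute hour dom month dow job <<EOF
$1
EOF
    echo "minute: $minute"
    echo "hour: $hour"
    echo "day of month: $dom"
    echo "month: $month"
    echo "day of week: $dow"
    echo "command: $job"
}

describe_cron "0 1 * * *  /root/duplicity-backup"
```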

Final notes

There are other ways to handle backups. For instance, you could make a python script that does the above and more (like dumping databases or packaging certain files). Using duplicity, though, is something that I can really recommend for whatever language you decide to write your backup script in. I can also recommend backing up the GnuPG key to a secure location (a USB stick in a vault or something), as that key is the key to your backups.
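Exporting the key for safekeeping could look like the sketch below. The function name and the USB path in the example are hypothetical; --armor and --export-secret-keys are regular gpg options. The exported file is still protected by the key’s passphrase, but treat it as sensitive anyway.

```shell
# export_secret_key KEY_ID OUTPUT_FILE (hypothetical helper): write an
# ASCII-armored copy of the secret key to OUTPUT_FILE.
export_secret_key() {
    # umask 077 in a subshell so a newly created file is owner-readable only.
    ( umask 077; gpg --armor --export-secret-keys "$1" > "$2" )
}

# Example (not run here; the target path is made up):
#   export_secret_key 1731EA9E /media/usb/duplicity-backup-key.asc
```

On another machine, gpg --import followed by the file name brings the key back.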
Happy back-upping.

from http://www.lowendbox.com/blog/automated-encrypted-backups-with-duplicity/