Tuesday, 15 March 2016

Automatically rsync from OS X to Linux

All of my data is stored on my NAS, from where it is automatically backed up daily. But editing photos in place on the NAS was slow, especially over WiFi and/or VPN. So I decided to store all photos locally, but without losing the automatic backups. I solved this with a launchd agent that watches the directory for changes (and runs every hour anyway), and rsync for the actual transfer.
An additional challenge was that file ownership and permissions needed to be synced across as well. (Usernames matched on both machines, but UIDs did not.)

The Launchd agent

cron has been deprecated under Mac OS X for a while now; launchd does that job instead (and more). So I wrote a very minimalistic launch agent:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
This plist was placed in /Library/LaunchDaemons/be.dest-unreach.sync-photos.plist, so it would run as root (in order to copy the file ownerships and permissions). Next, either reboot the machine, or run launchctl load /Library/LaunchDaemons/be.dest-unreach.sync-photos.plist to get it loaded.
This agent will run every hour, and additionally whenever the /Users/Shared/Lightroom folder is modified. Unfortunately, the watch is not recursive, but since Lightroom uses a SQLite database that it touches all the time, this was enough for me.
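For reference, a launchd job with the properties described here (an hourly run plus a watch on /Users/Shared/Lightroom) could be generated roughly as follows. This is only a sketch using Python's standard plistlib module: the Label is assumed from the plist filename, and the script path in ProgramArguments is purely hypothetical, so substitute whatever wraps your rsync call.

```python
import plistlib

# Hypothetical path to the script launchd should run; replace it with
# your own wrapper around the rsync command.
SYNC_SCRIPT = "/usr/local/bin/sync-photos.sh"

job = {
    # Assumed to match the plist filename.
    "Label": "be.dest-unreach.sync-photos",
    "ProgramArguments": [SYNC_SCRIPT],
    # Run every 3600 seconds (once an hour) ...
    "StartInterval": 3600,
    # ... and also whenever this directory itself is modified.
    "WatchPaths": ["/Users/Shared/Lightroom"],
}

# Serialize to the XML plist format that launchd reads.
plist_xml = plistlib.dumps(job, fmt=plistlib.FMT_XML).decode("utf-8")
print(plist_xml)
```

The printed XML can be saved under /Library/LaunchDaemons/ and loaded with launchctl as described above.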

Rsync security

Since file ownerships and permissions need to be maintained, rsync needs to run as root on both systems. This is a severe security risk, since I don’t want a compromised iMac to have full access to the NAS. To solve this, I’ve created a separate unprivileged backup account on the NAS. The iMac will SSH into that account, and from there connect to the running rsync daemon (which has root privileges). That way, a compromised iMac only has unprivileged shell access to the NAS (which can be severely limited).
The rsyncd.conf looks like this:
# This line is required by the /etc/init.d/rsyncd script
pid file = /var/run/

use chroot = yes
read only = no
hosts allow =
#port = 873

[lightroom]
    comment = Lightroom sync path from iMac
    list = no
    path = /mnt/data/lightroom
    charset = UTF-8
    read only = no
    uid = 0
    gid = 0
    auth users = lightroom
    secrets file = /etc/rsyncd.secrets
I’ve added an additional password to access this module as a precaution.
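The secrets file referenced above holds one `user:password` pair per line, and with rsyncd's default `strict modes` setting it must not be readable by other users. A minimal sketch of creating such a file follows; the helper name `write_secrets` and the example password are my own placeholders, not part of the original setup.

```python
import os


def write_secrets(path, user, password):
    """Write a one-entry rsyncd secrets file readable only by its owner."""
    # Create the file with mode 0600 so the password is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{user}:{password}\n")
    # If the file already existed, os.open kept its old mode; enforce 0600.
    os.chmod(path, 0o600)


# Example (run as root on the NAS):
# write_secrets("/etc/rsyncd.secrets", "lightroom", "change-me")
```

On the client side, the password can then be supplied through rsync's `--password-file` option or the `RSYNC_PASSWORD` environment variable.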

Connecting to the rsync-daemon

To connect to the rsync-daemon, there are some hoops to jump through: First an SSH-connection with a port forward, and then connect through there… Luckily, rsync really is a great program, and has built-in support for this scenario:
RSYNC_CONNECT_PROG="ssh -F .ssh/config %H socat - TCP:localhost:873" rsync <opts> . rsync://lightroom@remotehost/lightroom/.
You can add SSH-specific configuration either in the environment variable, or in the referenced ssh-config file, like I did:
Compression no
User backup
IdentityFile /Users/Shared/.ssh/backup.id_rsa
UserKnownHostsFile /Users/Shared/.ssh/known_hosts
(Note that the SSH-username and the rsync-username don’t have to be the same)


Unfortunately, I found rsync syncing over the same files again and again. The problem was that HFS+ uses UTF-8’s NFD form (it decomposes accented characters), while Linux doesn’t (it accepts either form, and simply stores it). So rsync systematically deleted the decomposed filenames, and copied over the composed ones, only to have HFS+ store them decomposed again.
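To make the mismatch concrete, here is a small sketch with Python's unicodedata module showing that the composed (NFC) and decomposed (NFD) forms of the same visible filename are different byte strings. The filename is just an example, and HFS+ actually applies a slight variant of NFD, though for a plain accented letter like this the result is the same.

```python
import unicodedata

# "café.jpg", written with an escape so the source encoding doesn't matter.
name = "caf\u00e9.jpg"

nfc = unicodedata.normalize("NFC", name)  # composed: a single code point for é
nfd = unicodedata.normalize("NFD", name)  # decomposed: "e" + combining accent

print(nfc == nfd)            # False
print(len(nfc), len(nfd))    # 8 9
print(nfc.encode("utf-8"))   # b'caf\xc3\xa9.jpg'
print(nfd.encode("utf-8"))   # b'cafe\xcc\x81.jpg'
```

A byte-wise comparison of filenames, which is what rsync does without a charset conversion, therefore treats the two forms as different files.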
The solution is rather simple: tell rsync about the conversion with the --iconv option. Unfortunately, the built-in rsync of OS X Mavericks is quite old, and doesn’t support the --iconv option. Luckily, Homebrew has an up-to-date one:
$ brew tap homebrew/dupes
$ brew install rsync
On the server side, you need to specify the charset option, so rsyncd will accept iconv from clients.
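Putting the pieces together, the full client-side call might be assembled like this. It is a sketch under several assumptions: `--archive` and `--delete` stand in for the unspecified `<opts>`, `--iconv=UTF-8-MAC,UTF-8` (local charset first, then remote) is the value commonly used on OS X rather than one quoted in this post, and the helper name `build_rsync_call` is made up for illustration.

```python
import os
import shlex


def build_rsync_call(source, module_url, ssh_config=".ssh/config"):
    """Return (env, argv) for an rsync run tunnelled through ssh + socat."""
    env = dict(os.environ)
    # rsync substitutes %H with the host name taken from the rsync:// URL.
    env["RSYNC_CONNECT_PROG"] = (
        f"ssh -F {ssh_config} %H socat - TCP:localhost:873"
    )
    argv = [
        "rsync",
        "--archive",                 # assumed option, not from the post
        "--delete",                  # assumed option, not from the post
        "--iconv=UTF-8-MAC,UTF-8",   # local (HFS+) charset, remote charset
        source,
        module_url,
    ]
    return env, argv


env, argv = build_rsync_call(".", "rsync://lightroom@remotehost/lightroom/.")
print(shlex.join(argv))
# To actually run it: subprocess.run(argv, env=env, check=True)
```

Keeping the command as an argument list avoids shell quoting problems if the source path ever contains spaces.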