
rsync output compressor

Rsync is a very nice tool for automating remote backups, especially in combination with daily snapshots (such as zfs snapshot).
Like many others, I have automated the process of running rsync on a daily basis via a cron job. Cron nicely sends me an email with the output of the rsync command.

I usually use the -v option so I can see which files have changed. This worked nicely several years ago, when there weren't many changes on my server. But nowadays I often receive emails of 10 MB or larger. That's not very useful.

Removing the -v option is possible, but then I don't see anything anymore (except perhaps a total summary).

To solve this problem I've hacked together an rsync-output-compressor script :)
You can find it at https://github.com/gamecreature/rsync-output-compressor

This script summarizes the output of rsync -v based on a given rules file. You can specify which files/folders should be mentioned explicitly and which folders/files should be grouped together.

This little script is written for Ruby 1.9 and higher.

An example

For example, let's look at the following output (... = many more lines):

rsync -avz --delete user@example.com:/data /backups/remote_data
receiving incremental file list
/home/emma/public_html/important_file.txt
/home/emma/public_html/important_file1.txt
/home/emma/public_html/important_file2.txt
/home/emma/public_html/important_file3.txt
...
/home/sarah/public_html/index.html
/home/sarah/public_html/images/
...
/home/david/private/special_file.txt
/home/david/public_html/downloads/new_download.zip

Using the following rules file (compress-rules.txt):

/home/*/public_html/

results in the following output:

rsync -avz --delete user@example.com:/data /backups/remote_data | rsync-output-compressor.rb --rules compress-rules.txt
receiving incremental file list
   123    -5 /home/emma/public_html/
    40       /home/sarah/public_html/
     1       /home/david/private/special_file.txt
     2       /home/david/public_html/

The first column (positive values) is the number of changed/added files, and the second column (negative values) is the number of deleted files.
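
To give an idea of how this kind of grouping can work, here is a minimal sketch in Ruby. It is not the code from the actual script; the names rule_to_regex and summarize are made up for this illustration. It assumes that "*" in a rule matches a single path component, that rsync -v reports removed files as lines starting with "deleting ", and that the input contains only file lines (the header and the final statistics lines are not handled).

```ruby
# Illustrative sketch only; NOT the real rsync-output-compressor code.
# rule_to_regex and summarize are hypothetical helper names.

# Turn a rule such as "/home/*/public_html/" into a regex that captures the
# matching directory prefix. Assumption: "*" matches one path component.
def rule_to_regex(rule)
  pattern = rule.split('*', -1).map { |part| Regexp.escape(part) }.join('[^/]*')
  /\A(#{pattern})/
end

# Count changed and deleted files per group. A path that matches no rule
# is kept as its own entry.
def summarize(lines, rules)
  regexes = rules.map { |rule| rule_to_regex(rule) }
  counts = Hash.new { |hash, key| hash[key] = { changed: 0, deleted: 0 } }

  lines.each do |line|
    path = line.chomp
    next if path.empty?

    kind = :changed
    if path.start_with?('deleting ')
      # Assumption: rsync -v prefixes deleted files with "deleting ".
      kind = :deleted
      path = path.sub('deleting ', '')
    end

    regex = regexes.find { |re| re =~ path }
    key = regex ? path[regex, 1] : path
    counts[key][kind] += 1
  end

  counts
end

# Small demo with made-up input lines.
sample = [
  '/home/emma/public_html/important_file.txt',
  '/home/emma/public_html/important_file1.txt',
  'deleting /home/emma/public_html/old_file.txt',
  '/home/david/private/special_file.txt'
]

summarize(sample, ['/home/*/public_html/']).each do |path, count|
  deleted = count[:deleted] > 0 ? "-#{count[:deleted]}" : ''
  printf("%6d %5s %s\n", count[:changed], deleted, path)
end
```

The real script does more than this (for example the -f option mentioned below), so treat the sketch as an explanation of the idea rather than a replacement.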

The tool has several other options, such as storing the original full output in an external location (option -f).

Using this script, my daily emails have been reduced from 10 MB to 30 KB :)
And I still know what is happening with my backup.

Feel free to use and improve this little script!