... which was inspired by Mike Rubel's web page.
Let's say I've got two hard drives. The first holds $HOME; the second is mounted at /mnt/hitachi (which happens to be the name of my second backup HDD).
First, create an initial snapshot by copying all your files from $HOME to /mnt/hitachi:
#!/bin/bash
DIRNAME=/mnt/hitachi/bak-$(date +%Y%m%d-%H%M%S)/
rsync --delete -av $HOME/ $DIRNAME
That creates a copy of your $HOME in, say, /mnt/hitachi/bak-20191014-121855.
Now create hard links for all the files in the backup. "ls -t" is used to find the newest directory under /mnt/hitachi; "cp -al" creates hard links instead of copying files as whole objects.
#!/bin/bash
path=/mnt/hitachi
DIRNAME=$path/$(ls -t $path | head -n1)
NEWDIRNAME=/mnt/hitachi/bak-$(date +%Y%m%d-%H%M%S)
cp -al $DIRNAME $NEWDIRNAME
touch $NEWDIRNAME
This creates a second snapshot, say, /mnt/hitachi/bak-20191015-211458. The second snapshot is like a virtual copy: each file is physically stored only once, but there are now two hard links to it, one from each snapshot.
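A quick way to convince yourself of this (a minimal sketch in a temporary directory, not on the actual backup drive):

```shell
#!/bin/bash
# Demonstrate that cp -al "copies" via hard links, not new data.
set -e
tmp=$(mktemp -d)
mkdir $tmp/snap1
echo "some data" > $tmp/snap1/file.txt
cp -al $tmp/snap1 $tmp/snap2       # second "snapshot", hard-linked
ls -i $tmp/snap1/file.txt $tmp/snap2/file.txt  # same inode number twice
stat -c %h $tmp/snap1/file.txt     # link count: prints 2
rm -rf $tmp
```

Both directory entries point to the same inode, so the data exists on disk only once.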
Now update (or populate) the second (or newest) snapshot:
#!/bin/bash
path=/mnt/hitachi
DIRNAME=$(ls -t $path | head -n1)
rsync -av $HOME/ $path/$DIRNAME/ --delete --ignore-errors
This script copies files that are absent in the last snapshot and deletes hard links for files removed from $HOME. (A physical file is deleted only when no hard links point to it anymore.)
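You can verify that data survives as long as at least one link to it remains (again, a sketch in a temporary directory):

```shell
#!/bin/bash
# Deleting one hard link does not delete the data;
# only removing the last link frees the physical file.
set -e
tmp=$(mktemp -d)
echo "payload" > $tmp/a
ln $tmp/a $tmp/b        # second hard link to the same data
rm $tmp/a               # delete one name...
cat $tmp/b              # ...data is still there: prints "payload"
stat -c %h $tmp/b       # link count is back to 1
rm -rf $tmp
```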
Run the second and third scripts once a day, or more often...
Now you have a set of snapshots. This is like preserving the history of a file or a directory, like a VCS does, or like Wikipedia tracks the history of each article.
If your second backup HDD runs out of free space, just delete the oldest snapshot(s). Again, this will not delete files still linked from newer snapshots.
Why are snapshots important? You may have deleted a file a couple of days ago AND then made a backup. With snapshot-style backups, the file is still accessible as long as you have a snapshot made BEFORE the deletion. This can also be a defense against ransomware: even if your backup system backed up the already-encrypted files, an older snapshot may still hold the intact versions...
Windows has a similar nice feature: https://en.wikipedia.org/wiki/Shadow_Copy.
Script to delete oldest snapshot:
#!/bin/bash
path=/mnt/hitachi
echo "Before trimming:"
df --output=source,target,avail,pcent $path | grep -v Mounted
DIRNAME=$(ls -tr $path | head -n1)
echo "Going to trim $path/$DIRNAME"
# -I doesn't work as I wanted, because rm asks about removing write-protected files...
read -p "Press enter to actually delete the files, or Ctrl-C to abort"
rm -rf $path/$DIRNAME
echo "After trimming:"
df --output=source,target,avail,pcent $path | grep -v Mounted
However, you can also delete just the largest files in the oldest snapshot, right? Even better, only those files that are linked once, i.e., stored in only one snapshot. (Deleting files linked from several snapshots is pointless: you can't free much space that way.) In my case, these are old VM images...
find . -size +1G -type f -links 1 -print
And if you're sure:
find . -size +1G -type f -links 1 -delete
Thus, old snapshots are no longer full copies (the largest files may be missing). But your backup system can now store many more snapshots (or more history) of the smaller files.
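To see how much unique space each snapshot actually consumes, du can help: when several directories are given in a single invocation, GNU du counts each hard-linked file only once, charging it to the first snapshot where it appears, so later snapshots show roughly their delta (a sketch, assuming the /mnt/hitachi layout from above):

```shell
# The first (oldest) snapshot shows the full size; later ones show
# mostly the space taken by files not shared with earlier snapshots.
du -sh /mnt/hitachi/bak-*
```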
Date: Thu, 17 Oct 2019 08:25:07 +0300
From: Ciprian Dorin Craciun <ciprian(dot)craciun(at)gmail.com>

I've read your article 'Simplest possible snapshot-style backups using rsync' (https://yurichev.com/blog/bak/), and found it interesting. (Personally I use 'rdiff-backup'.)

...

Also please note that you can use just 'rsync' to obtain the same results (without using the extra 'cp') by using '--compare-dest' or '--link-dest'.

Moreover, it would be useful to point out that if one changes a file in a "snapshot", and it is linked, then all the other snapshots are also affected. Perhaps 'chattr' with the immutable flag could be useful after a backup to make sure this doesn't happen; but before the next backup that immutable flag must be removed so that hard links can be made.

Thanks for the article, Ciprian.
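Ciprian's suggestion can be sketched like this: a single rsync with --link-dest replaces the cp -al plus rsync pair, hard-linking unchanged files against the previous snapshot directly (a sketch, assuming the same /mnt/hitachi layout as in the scripts above):

```shell
#!/bin/bash
# One-step snapshot using rsync --link-dest
# (replaces the second and third scripts).
path=/mnt/hitachi
PREV=$path/$(ls -t $path | head -n1)
NEW=$path/bak-$(date +%Y%m%d-%H%M%S)
# Unchanged files become hard links into $PREV; changed or new files
# are copied; files deleted from $HOME are simply absent in $NEW.
rsync -av --delete --link-dest=$PREV $HOME/ $NEW/
```

Note that --link-dest should be given an absolute path, since a relative one is resolved against the destination directory.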
Please drop me an email about any bug(s) and suggestion(s): dennis(@)yurichev.com.