Backing up a ZFS pool to a Raspberry Pi with syncoid
Recently the NAS at my parents’ home broke, and I needed a quick alternative for taking off-site backups of a server. Since that NAS was only there for me when I was younger, and I couldn’t visit them to replace it, I opted to turn the Pi 4 that was still there doing nothing into a ZFS replication device. I prepared an SD card with Ubuntu Server for the Raspberry Pi and asked my parents to swap the SD card in the Pi. Then I logged in and set everything up remotely.
Step 1: Setting up ZFS
First, we’ll need to get the ZFS tools installed:
sudo apt install zfsutils-linux
Then we make a pool called rpool, representing the whole vdev, and we add a dataset named rpool/backup that uses lz4 compression. The command below is for a single disk, with $DISK_ID as a placeholder for the disk you want to use. If you have multiple disks, you may want to make a mirror vdev (like I will probably do after Covid-19). Look up what configuration will work best for you. There is a more elaborate tutorial on how to set up a ZFS pool on the Ubuntu website.
sudo zpool create -f -o ashift=12 -O acltype=posixacl -O xattr=sa rpool "$DISK_ID"
sudo zfs create rpool/backup
sudo zfs set compression=lz4 rpool/backup
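To double-check the result, something like the sketch below could be used. The helper name dataset_has_compression is made up for this example; it only wraps zfs get and compares the value with what we expect.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ZFS): succeed only if the dataset
# reports the expected value for its compression property.
dataset_has_compression() {
  local dataset=$1 expected=$2 actual
  # -H drops the header, -o value prints only the property value
  actual=$(zfs get -H -o value compression "$dataset") || return 1
  [ "$actual" = "$expected" ]
}
```

For example: `dataset_has_compression rpool/backup lz4 && echo "compression ok"`. Running `zpool status rpool` next to it shows whether the pool itself came up as expected.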
Step 2: Put the Pi’s SSH key on the remote server
We will pull in snapshots over SSH. To this end, we add the public SSH key of the root user on the Pi to the root user’s authorized keys file on the remote server.
Note that we do not do it the other way around: we want our backup server to reach out to the remote server and pull in snapshots. We do not want the remote server to push snapshots to the Pi, because if that server gets compromised it should not have access to the backup server (to wipe it).
It is advised to always pull from the backup server, and to harden the backup server’s security: only accessible over VPN, the bare minimum of running services, and so on.
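A minimal sketch of the key part, to be run as root on the Pi. The function name, the ed25519 key type and the default path /root/.ssh/id_ed25519 are assumptions of this example, not something the setup requires.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a key pair if none exists yet and print the
# public key, so it can be added to authorized_keys on the remote server.
ensure_root_key() {
  local key=${1:-/root/.ssh/id_ed25519}   # assumed default location
  if [ ! -f "$key" ]; then
    mkdir -p "$(dirname "$key")"
    # -N '' means no passphrase, so the timer can use the key unattended
    ssh-keygen -q -t ed25519 -N '' -f "$key"
  fi
  cat "$key.pub"
}
```

Calling `ensure_root_key` prints one line; append that line to /root/.ssh/authorized_keys on the remote server, then test the connection from the Pi with `sudo ssh root@remote true`.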
Step 3: Install Syncoid
To pull in snapshots from the remote machine, we will use Syncoid. This is a tool bundled with the policy-driven ZFS snapshot management package Sanoid. To install it, use:
sudo apt install sanoid
Syncoid facilitates incremental replication of ZFS datasets.
Example with one dataset
# ___________ source ___________ _______target ______
sudo syncoid root@remote:rpool/data/mydataset rpool/backup/mydataset
The above snippet:
- Creates a snapshot named “syncoid_ubuntu_YYYY-MM-DD-HH:MM:SS” for rpool/data/mydataset on the remote server.
- Rolls back rpool/backup/mydataset to the latest common snapshot between the target and the source.
- Incrementally receives that “syncoid_ubuntu_YYYY-MM-DD-HH:MM:SS” snapshot and all earlier snapshots to our target rpool/backup/mydataset dataset (the pi4).
- Removes older “syncoid_ubuntu_YYYY-MM-DD-HH:MM:SS”-like snapshots on both the source and the target (keeping the latest one).
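The “latest common snapshot” idea can be illustrated with plain text files. This is only a simplified sketch that compares snapshot names; it is not how Syncoid itself does the matching, and the function name is made up for the example.

```shell
#!/usr/bin/env bash
# Illustration only: given two files with snapshot names (one per line,
# oldest first), print the newest source snapshot that also exists on
# the target.
latest_common_snapshot() {
  local source_list=$1 target_list=$2
  # -F fixed strings, -x whole-line match, -f read patterns from the target list
  grep -Fxf "$target_list" "$source_list" | tail -n 1
}
```

The input files could be produced with `zfs list -H -o name -t snapshot -s creation` on each side, with the dataset prefix before the @ stripped off so the names are comparable.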
Working with multiple datasets recursively
We want Syncoid to incrementally receive all the snapshots for all datasets under rpool/data on the remote machine into rpool/backup on the Pi. To do this we use:
sudo syncoid -R --skip-parent --no-rollback root@remote:rpool/data rpool/backup
Let’s go over the flags:
- -R indicates that we want to recursively visit all datasets under root@remote:rpool/data.
- --skip-parent ensures that we do not receive root@remote:rpool/data itself (I don’t have anything in there).
- --no-rollback prevents Syncoid from rolling back snapshots on the target machine (the Pi).
The --no-rollback flag ensures that Syncoid does not delete snapshots but only adds them. If you mess with the snapshots on your remote machine (e.g. by doing a rollback), ZFS receive will not be able to receive the latest snapshot. Syncoid will continue with the other datasets and then exit with exit code 2. To still get the changes regardless, you can remove this flag; Syncoid will then look for the latest common snapshot, roll back the target to that snapshot, and then do the receive.
Because this is a backup system, we do not want to roll back in case an adversary rolls back the ZFS filesystem on our server.
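Since a failed receive only shows up as a non-zero exit code, it helps to make that code visible. Below is a small sketch of a wrapper that does this; run_pull is a made-up name, and it simply runs whatever command it is given.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the pull command passed as arguments and
# report how it ended, passing the exit code on to the caller.
run_pull() {
  local status=0
  "$@" || status=$?
  if (( status == 0 )); then
    echo "pull ok"
  else
    # With --no-rollback, the text above describes exit code 2 when a
    # dataset could not be received; any non-zero code counts as failure here.
    echo "pull failed with exit code $status" >&2
  fi
  return "$status"
}
```

Usage: `run_pull syncoid -R --skip-parent --no-rollback root@remote:rpool/data rpool/backup`.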
Step 4: Automate and monitor
So that we do not forget to run Syncoid, we will use a systemd timer. As a general rule, you should actively monitor your automated backups. If you don’t and something silently goes wrong, you will not notice it until it is too late. Have your monitoring service constantly check the freshness of your backups (active monitoring). Do not rely only on the backup server to send you mail when something goes wrong (passive monitoring).
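As a sketch of what such a freshness check could look like: the function below reads tab-separated “creation, name” lines, as printed by `zfs list -Hpo creation,name -t snapshot`, and succeeds only if the newest snapshot is recent enough. The function name and the threshold in the usage line are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical check: read "creation<TAB>name" lines on stdin and succeed
# only if the newest snapshot is at most max_age seconds old.
newest_is_fresh() {
  local max_age=$1 now=${2:-$(date +%s)} newest=0 creation name
  while IFS=$'\t' read -r creation name; do
    [[ $creation =~ ^[0-9]+$ ]] || continue   # skip malformed lines
    if (( creation > newest )); then
      newest=$creation
    fi
  done
  # No snapshots at all counts as "not fresh"
  (( newest > 0 && now - newest <= max_age ))
}
```

Usage, allowing for one missed nightly run: `zfs list -Hpo creation,name -t snapshot -r rpool/backup | newest_is_fresh $((36 * 60 * 60)) || echo "backups are stale"`.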
Create a file /opt/syncoid-pull/syncoid-pull, owned by root and only writable and executable by root, that calls syncoid and sends off metrics for monitoring:
#!/usr/bin/env bash
set -xeuo pipefail # failfast and be verbose
syncoid -R --skip-parent --no-rollback --debug root@remote:rpool/data rpool/backup
zfs list -Hpo creation,name,used -t snapshot -r rpool/backup -s creation |\
sed 's:\t\([^@]*\)@:\t\1\t:' |\
column -J --table-columns creation,dataset,name,size -s $'\t' --table-name 'snapshots' |\
ssh root@remote 'cat > /var/www/status/backups.json'
The last command in this file sends a JSON of snapshots to a file on the remote machine to be monitored. Replace the last line with something that interacts with your monitoring system.
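To see what the sed step in that pipeline does, it can be run on a single made-up line. The snapshot name below is invented for the example, and the \t escape assumes GNU sed (the default on Ubuntu).

```shell
#!/usr/bin/env bash
# Same sed expression as in the script: split "dataset@snapshot" into two
# tab-separated fields so that column can label them separately.
split_snapshot_name() {
  sed 's:\t\([^@]*\)@:\t\1\t:'
}

# Made-up input line: creation, dataset@snapshot, used
printf '1600000000\trpool/backup/mydataset@example_snapshot\t1024\n' | split_snapshot_name
# prints: 1600000000<TAB>rpool/backup/mydataset<TAB>example_snapshot<TAB>1024
```

After this step each line has four fields, matching the four names passed to --table-columns.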
Automate a call to the script above during off-peak hours with these systemd unit and timer files:
/etc/systemd/system/syncoid-pull.service:
[Unit]
Description=syncoid pull
Requires=local-fs.target
After=local-fs.target
[Service]
Type=oneshot
ExecStart=/opt/syncoid-pull/syncoid-pull
WorkingDirectory=/opt/syncoid-pull/
/etc/systemd/system/syncoid-pull.timer:
[Unit]
Description=syncoid pull every night
[Timer]
OnCalendar=04:37:41
Persistent=true
[Install]
WantedBy=timers.target
sudo systemctl enable syncoid-pull.timer && sudo systemctl start syncoid-pull.timer
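To confirm that the timer is armed and to look at the last run, a sketch like the following could be used. The function name is made up; the systemctl and journalctl calls are standard.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fail if the timer is not enabled, otherwise show
# when it fires next and the last few log lines of the service.
check_pull_timer() {
  local unit=${1:-syncoid-pull}
  if ! systemctl is-enabled --quiet "$unit.timer"; then
    echo "$unit.timer is not enabled" >&2
    return 1
  fi
  systemctl list-timers --no-pager "$unit.timer"
  journalctl --no-pager -n 20 -u "$unit.service"
}
```

Run it as `check_pull_timer` (with sudo if your user cannot read the journal). A first run can be triggered by hand with `sudo systemctl start syncoid-pull.service`.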
Extra: Delete old snapshots
To preserve some space on the Pi, I want to remove snapshots older than 2 months automatically. To do this, I added the equivalent of the following snippet to my syncoid-pull script.
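now=$(date +%s)
max_age=$(( 60 * 60 * 24 * 30 * 2 ))  # two months, counted as 60 days, in seconds
# List snapshots as "creation<TAB>name" and keep only the zfs-auto-snap_ ones.
# "|| true" keeps set -e/pipefail from aborting the script when grep matches nothing.
zfs list -Hpo creation,name -t snapshot -r rpool/backup \
| { grep $'\t''rpool/backup/[^@]*@zfs-auto-snap_' || true; } \
| tac \
| while read -r creation snapshot; do
    if (( now - creation > max_age )); then
      zfs destroy "$snapshot"
    fi
  done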
It finds all zfs-auto-snap_ snapshots under rpool/backup and deletes them once they reach the age of two months. The tac reverses the list, which makes it faster. This action should be performed before sending the list of snapshots to monitoring.
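Before letting a script destroy anything, it can be reassuring to do a dry run first. The sketch below applies the same age rule but only prints the names; expired_snapshots is a made-up helper name.

```shell
#!/usr/bin/env bash
# Hypothetical dry run: read "creation<TAB>name" lines on stdin and print
# the snapshots older than max_age seconds, without destroying anything.
expired_snapshots() {
  local max_age=$1 now=${2:-$(date +%s)} creation snapshot
  while IFS=$'\t' read -r creation snapshot; do
    [[ $creation =~ ^[0-9]+$ ]] || continue   # skip malformed lines
    if (( now - creation > max_age )); then
      printf '%s\n' "$snapshot"
    fi
  done
}
```

Usage: `zfs list -Hpo creation,name -t snapshot -r rpool/backup | expired_snapshots $((60 * 60 * 24 * 30 * 2))`. Once the list looks right, the real snippet can take over.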
Other tips and tricks
- Attach an M.2 SATA SSD to your Pi through a USB 3.0 external SSD enclosure.
- Do regular ZFS scrubs (zpool scrub rpool) to check the health of your pool and send the health status to monitoring.
- Use logcheck to send mails to you when a service fails.
- Don’t boot your Pi from the SD card.
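For the scrub tip, the health status that goes to monitoring could come from a check like this one. It is a sketch; the function name is made up and the pool name defaults to the rpool used in this post.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if the pool reports health ONLINE.
pool_is_healthy() {
  local pool=${1:-rpool} health
  # -H drops the header, -o health prints only the health column
  health=$(zpool list -H -o health "$pool") || return 1
  [ "$health" = "ONLINE" ]
}
```

Usage: `pool_is_healthy rpool || echo "rpool needs attention"`. Note that a scrub started with `zpool scrub rpool` runs in the background, so the check is most useful some time after the scrub was started.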