Backup and Restore

Just had a bit of a struggle doing a backup and test restore onto a new server (so I could try the upgrade to 4.0). The internet is full of people suggesting quick one-liners to back up docker containers, yet none of them seemed to work for me.

First attempt was the manual dump and restore. Having copied docker-compose.yml and aleph.env to the new server, I started up the stack and then did something like

cat ../backups/latest/postgres-data/dump_2023-05-24-21-05.sql | docker-compose exec -T postgres psql -U aleph -d aleph
# Copy other files back into place
docker cp ../backups/latest/archive-data/ aleph_ingest-file_1:/data/
docker cp ../backups/latest/redis-data/ aleph_redis_1:/data/
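
For reference, the dump being restored above would have been produced with something like the following (the service name, database user, and backup path here are my assumptions based on the restore commands, so adjust for your compose file):

```shell
# Dump the aleph postgres database to a timestamped file, matching the
# dump_YYYY-MM-DD-HH-MM.sql naming used above (names and paths assumed)
STAMP="$(date +%Y-%m-%d-%H-%M)"
docker-compose exec -T postgres pg_dump -U aleph -d aleph \
  > "../backups/latest/postgres-data/dump_${STAMP}.sql"
```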

Nothing showed immediately, so I restarted the stack. Still no data. I then tried

docker-compose run --rm shell aleph upgrade
docker-compose run --rm shell aleph reindex-full

as those have rescued me in the past. But this time they just produced errors I'd never seen before.

So I moved on to the second method: exporting the docker volumes to a tar file and then reimporting them. The general method was e.g.

 docker run --rm --volumes-from dbstore -v $(pwd):/backup ubuntu tar cvf /backup/backup.tar /dbdata
# and then to restore
 docker run --rm --volumes-from dbstore2 -v $(pwd):/backup ubuntu bash -c "cd /dbdata && tar xvf /backup/backup.tar --strip 1"

This approach also left me with an empty data set, and errors when reindexing and upgrading.

So then I moved on to the brute-force sysadmin method. I stopped aleph on both sides, and then just rsynced the /var/lib/docker/volumes/aleph* directories over. When I started it up on the target computer, all was good.
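
For anyone following along, that brute-force sync was essentially the following (with both stacks stopped; the target hostname is a placeholder):

```shell
# Copy the aleph-prefixed docker volumes wholesale to the new server.
# Run only while aleph is stopped on BOTH hosts, or the data may be inconsistent.
rsync -a /var/lib/docker/volumes/aleph* root@newserver:/var/lib/docker/volumes/
```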

So my question is, first of all: how are others managing their backup / restore process? I can't believe, with the profusion of docker everywhere, that it's so hard to back up, and using the last solution seems a little indelicate. Am I missing a step? Is the brute-force rsync method OK for production use? I'd love to hear other people's thoughts.

Still interested to hear what is working for people backing up their aleph installations. I understand it's not a case of a single strategy working for everyone, but I think it would be helpful to have some tried and tested starting points.

Also, and I guess most importantly: is my currently working method of syncing docker volumes trustworthy, valid, and workable?

Hey, in general:

Aleph deployment via docker should always use mounted / bind volumes from the host system, not docker volumes “inside” the containers, as docker may create a new volume for a new container (depending on the configuration in the compose file, though)
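
A minimal sketch of what that looks like in docker-compose.yml (the host paths here are examples, not Aleph's defaults):

```yaml
services:
  postgres:
    image: postgres
    volumes:
      # bind mount: data lives at a known host path you can back up directly,
      # instead of in an anonymous/named docker volume
      - /srv/aleph/postgres-data:/var/lib/postgresql/data
```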

Then, back up these paths. No docker is needed for that, just rsync the files (or another backup tool of your choice).

Only the file archive and the full sql dump are needed to restore a full Aleph instance, as the data from the followthemoney store (in sql) is used to re-index all the data back into (a new) elasticsearch.
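
So a restore from just those two pieces might look like this (a sketch assuming a standard compose setup; service names and paths will differ per install):

```shell
# 1. Restore the SQL dump into the (fresh) postgres container
cat backups/dump.sql | docker-compose exec -T postgres psql -U aleph -d aleph

# 2. Put the file archive back at its bind-mounted host path
rsync -a backups/archive-data/ /srv/aleph/archive-data/

# 3. Rebuild elasticsearch from the followthemoney store
docker-compose run --rm shell aleph upgrade
docker-compose run --rm shell aleph reindex-full
```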

Of course, depending on the overall size of the data, more advanced strategies should be considered:

  • Don't use the file system for the archive; use a distributed object storage deployment instead (AWS S3, GCS, self-hosted MinIO) with replicas
  • Use psql replication for failovers
  • Use elasticsearch replication for failovers

For smaller instances, “just” syncing the volumes should work. I even rsync’ed the elasticsearch volume once and it worked, but that’s definitely not the way elasticsearch should be backed up and restored in a production environment :slight_smile:

Thanks for that input Simon. I’ve been running and backing up servers for a couple of decades, but as soon as you introduce docker into the mix it seems to complicate things!

I’ve been following various guides and recommendations around the internet about dumping and restoring postgresql. I’ve never had issues with this before, when using the non-docker version, but for some reason, dumps are refusing to restore inside of docker.

I have gathered there’s a big difference between restoring a dump through pg_restore and restoring through psql, between using the -c flag and not, and between plain vs custom dump formats, but I haven’t found a winning combination yet. So that’s where I hit upon the docker volume copy method, which seemed to work.
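
For what it's worth, the two combinations that are supposed to pair up are roughly these (a sketch of the standard postgres tooling, not tested against Aleph's schema):

```shell
# Plain-text dump: restore by feeding it back through psql
pg_dump -U aleph -d aleph > dump.sql
psql -U aleph -d aleph -f dump.sql

# Custom-format dump (-Fc): restore with pg_restore;
# -c (--clean) drops existing objects before recreating them
pg_dump -U aleph -d aleph -Fc -f dump.custom
pg_restore -U aleph -d aleph -c dump.custom
```

Mixing them (e.g. piping a custom-format dump into psql) is one of the ways a restore silently produces an empty database.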

Will now try your suggestions via bind mounts, which I think I’d prefer anyway, coming from a sysadmin background.

The aleph install is not large (or busy), and easily contained on the existing server, so I think that should be the way forward.
