
Backups for a database running at the edge

Craig Nielsen
November 20, 2025
High-Level Setup
- what defines an edge? is this used for an in-cluster database or for edge setups?
- why do you need to back up?
- using Spilo in a cluster: why does it make sense?
- possibly mention that the backup storage should be on a different node at least, but ideally in another data center.
- mention how Kubernetes can run across regions and you just have to deal with the latency, but adding a node is as simple as *add command here.
Backups
- the database is running in the cluster
- the backups use WAL-E, configured through the Patroni cluster setup on the pod
- the schedule is set up as part of the postgres-cluster config
- the backups are written locally, to an in-cluster object store (MinIO)
- there is a cron job that syncs these local backups to Azure for disaster recovery (DR)
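
The sync step can be thought of as a one-way mirror from the local object store to Azure. The following is a minimal sketch of that idea only, assuming each side can be listed as a mapping of object key to size. The function name `plan_sync` and the sample keys are hypothetical, and the actual cron job may well rely on an off-the-shelf sync tool instead.

```python
def plan_sync(local: dict[str, int], remote: dict[str, int]) -> list[str]:
    """Return the keys that must be uploaded so that remote mirrors local.

    A key is selected when it is missing remotely or its size differs.
    Objects that exist only remotely are left alone (no deletion).
    """
    return sorted(
        key for key, size in local.items()
        if remote.get(key) != size
    )


if __name__ == "__main__":
    # Hypothetical listings: key -> size in bytes.
    local = {"basebackups/base_001": 1024, "wal/segment_0001": 16}
    remote = {"basebackups/base_001": 1024}
    for key in plan_sync(local, remote):
        print("upload", key)
```

Leaving remote-only objects untouched is a deliberate choice in this sketch: a local failure should never be able to delete the off-site copy.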
Recovery
- for a local recovery, backups can be fetched from the local object storage
- in a DR scenario, a "restore_job" Kubernetes Job is run
- it performs a one-off clone from Azure to the local object storage
- from this local storage the Postgres cluster can be restored
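
The two recovery paths above differ only in whether the backups first have to be cloned back from Azure. A small sketch of that decision, with the step descriptions written as plain strings; the function name `recovery_steps` and its parameter are illustrative assumptions, not part of the actual setup.

```python
def recovery_steps(local_backups_available: bool) -> list[str]:
    """List the recovery steps in order for the local and DR cases."""
    steps = []
    if not local_backups_available:
        # DR case: the local object storage is empty or gone,
        # so the one-off clone from Azure has to run first.
        steps.append("run restore_job: clone backups from Azure to local object storage")
    # In both cases the database is restored from local object storage.
    steps.append("restore Postgres cluster from local object storage")
    return steps


if __name__ == "__main__":
    for step in recovery_steps(local_backups_available=False):
        print(step)
```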
Drawbacks
- of course, large volumes of data could cause issues in the cluster
- data is duplicated locally on disk at the moment, since there is only 1 node. To fix this, the idea is to deploy new nodes that join the cluster. Object storage will then live on another node, using Kubernetes taints and tolerations to control placement. This helps you recover from local node issues.
- In the case of DR, you can set up a new cluster and sync the backups to it locally.
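
The taint-and-toleration idea from the list above can be sketched as the pod spec fragment the object storage workload would carry. This assumes the dedicated node has been tainted with something like `kubectl taint nodes <node-name> dedicated=object-storage:NoSchedule` and labeled to match; the key `dedicated`, the value `object-storage`, and the helper name are hypothetical. The `tolerations` and `nodeSelector` fields themselves are standard Kubernetes pod spec fields.

```python
import json


def storage_placement(key: str = "dedicated", value: str = "object-storage") -> dict:
    """Build a pod spec fragment that pins a workload to a tainted node.

    The toleration lets the pod schedule onto the tainted node; the
    nodeSelector makes sure it lands there and nowhere else. A taint on
    its own only keeps other pods away.
    """
    return {
        "tolerations": [
            {
                "key": key,
                "operator": "Equal",
                "value": value,
                "effect": "NoSchedule",
            }
        ],
        # Assumes the node also carries a label with the same key and value.
        "nodeSelector": {key: value},
    }


if __name__ == "__main__":
    print(json.dumps(storage_placement(), indent=2))
```

Both parts are needed: without the `nodeSelector`, the object storage pod could still be scheduled next to the database, which would bring back the duplication problem.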
Benefits
- faster recovery from node issues
- DR support for local issues using Azure
- Cheaper than any online DB service
- Locally you can extend the database nodes and create dedicated nodes, naturally extending the database size as your application scales.