Backup, Disaster Recovery and High Availability¶
Introduction¶
Backup, Disaster Recovery (DR) and High Availability (HA) go hand in hand, and they are typically not well planned in SMB and home setups.
- High Availability: Ensure that the services continue running in case of component failure
- Backup: Ensure that you never lose data
- Disaster Recovery: Ensure that you can recover from a fatal error
A goal for TAPPaaS is to ensure all three topics are delivered out of the box.
High Availability¶
TAPPaaS provides options for managing the following high availability scenarios:
- Hard disk failure
- General hardware failure
- Reboots and reconfiguration of TAPPaaS nodes
- Overloaded services
- Internet failure
Hard Disk Redundancy¶
Proxmox and its ZFS file system allow TAPPaaS to be configured with mirror or RAIDz1/RAIDz2 redundancy.
- TAPPaaS distinguishes between important and non-important services, which are deployed on two different ZFS datapools (tanka and tankb). This way you can reserve hard disk redundancy for the services that matter most.
- TAPPaaS does not enforce a particular redundancy level for the datapools, but recommends a mirror for tanka and no redundancy for tankb.
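To make the trade-off between the redundancy levels concrete, the following sketch estimates usable capacity and the number of disk failures a pool survives. It assumes a pool built from a single vdev of equally sized disks and ignores ZFS metadata and padding overhead, so the figures are approximations; the function name `pool_layout` and the level names are illustrative, not part of TAPPaaS.

```python
def pool_layout(level: str, disks: int, disk_tb: float) -> dict:
    """Approximate usable capacity and failure tolerance of a single-vdev pool.

    Assumption: all disks have the same size and ZFS overhead is ignored.
    """
    if level == "none":
        # Plain stripe: all capacity is usable, any disk failure loses the pool.
        if disks < 1:
            raise ValueError("need at least 1 disk")
        data_disks, tolerated = disks, 0
    elif level == "mirror":
        # Every disk holds a full copy, so only one disk's worth is usable.
        if disks < 2:
            raise ValueError("a mirror needs at least 2 disks")
        data_disks, tolerated = 1, disks - 1
    elif level in ("raidz1", "raidz2"):
        # RAIDz1 spends one disk's worth of space on parity, RAIDz2 two.
        parity = int(level[-1])
        if disks <= parity:
            raise ValueError(f"{level} needs more than {parity} disks")
        data_disks, tolerated = disks - parity, parity
    else:
        raise ValueError(f"unknown redundancy level: {level}")
    return {"usable_tb": data_disks * disk_tb, "tolerated_failures": tolerated}


if __name__ == "__main__":
    # The recommended defaults: a mirror for tanka, no redundancy for tankb.
    print("tanka:", pool_layout("mirror", disks=2, disk_tb=4.0))
    print("tankb:", pool_layout("none", disks=1, disk_tb=2.0))
```

With the recommended defaults, tanka gives up half of its raw capacity in exchange for surviving one disk failure, while tankb keeps all of its capacity and survives none.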
Cluster Setup¶
TAPPaaS supports setting Proxmox up as a cluster with 3 or more nodes. This is not a requirement, but it is recommended for anything but a small setup.
If TAPPaaS is configured on 3 or more nodes, each high-priority service is set up with a default failover node, and regular snapshot transfer to that node is configured.
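The reason for the 3-node minimum can be illustrated with a small quorum calculation. This is a sketch that assumes a simple majority quorum with one vote per node, which is how Proxmox clusters behave by default; the function names are illustrative.

```python
def quorum(nodes: int) -> int:
    """Votes needed for a majority, assuming one vote per node."""
    return nodes // 2 + 1


def tolerated_node_failures(nodes: int) -> int:
    """Nodes that can fail while the remaining nodes still hold a majority."""
    return nodes - quorum(nodes)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} node(s): quorum={quorum(n)}, "
              f"tolerated failures={tolerated_node_failures(n)}")
```

Under these assumptions a 2-node cluster tolerates no node failure, because the surviving node alone is not a majority. Three nodes is the smallest cluster that keeps quorum when one node is down.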
Internet Failure¶
The intent is to test the OPNsense setup and its caching recursive DNS resolver to ensure that the local TAPPaaS ecosystem continues to function during an internet outage.
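One way such a test could look is a script that is run while the internet uplink is disconnected and reports which local service names no longer resolve. This is only a sketch: the hostnames below are hypothetical placeholders, and the check covers name resolution only, not whether the services themselves respond.

```python
import socket
from typing import Callable, Iterable


def default_resolver(name: str) -> bool:
    """Return True if the system resolver can resolve the name."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def unresolved_names(
    names: Iterable[str],
    resolves: Callable[[str], bool] = default_resolver,
) -> list[str]:
    """Return the names that fail to resolve, in their original order."""
    return [name for name in names if not resolves(name)]


if __name__ == "__main__":
    # Hypothetical local service names; replace with names from your own setup.
    local_services = ["nextcloud.lan.example", "pbs.lan.example"]
    failed = unresolved_names(local_services)
    print("all local names resolve" if not failed else f"unresolved: {failed}")
```

The resolver is passed in as a parameter so that the check can be exercised with a fake resolver, without changing the network setup.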
Backup Strategy¶
We follow the 3-2-1 backup design principle for TAPPaaS:
- Have 3 copies of data
- Have 2 different formats of backup
- Have 1 backup in a remote location
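The three rules can be expressed as a simple check over an inventory of backup copies. The sketch below is illustrative only: the `BackupCopy` type and the example inventory are assumptions, and whether the live data counts as one of the 3 copies is a choice that this example leaves out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BackupCopy:
    name: str
    fmt: str       # backup format, e.g. "pbs" or "native"
    remote: bool   # True if stored at a different location


def satisfies_3_2_1(copies: Iterable[BackupCopy]) -> bool:
    """Check for at least 3 copies, 2 formats, and 1 remote copy."""
    copies = list(copies)
    formats = {c.fmt for c in copies}
    has_remote = any(c.remote for c in copies)
    return len(copies) >= 3 and len(formats) >= 2 and has_remote


if __name__ == "__main__":
    # Hypothetical inventory for illustration.
    inventory = [
        BackupCopy("local PBS", fmt="pbs", remote=False),
        BackupCopy("remote PBS", fmt="pbs", remote=True),
        BackupCopy("personal USB backup", fmt="native", remote=False),
    ]
    print(satisfies_3_2_1(inventory))
```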
Design Principles¶
The primary design principle for the TAPPaaS backup strategy is that all configuration and all user data are located inside the VMs that host the services. The TAPPaaS instance configuration itself is located inside the TAPPaaS CICD VM.
Proxmox Backup Server (PBS)¶
Every TAPPaaS system should have a local Proxmox Backup Server. Backups to the PBS run at a regular interval (the default is daily). Thanks to compression and deduplication, it can keep:
- Daily backups for 7 days
- Weekly backups for 4 weeks
- Monthly backups for one year
- Yearly backups forever
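To show how such a tiered retention schedule thins out older backups, here is a simplified model of the selection. It is a sketch in the spirit of PBS pruning, not PBS's actual implementation: each rule keeps the newest backup per period, and periods already covered by an earlier rule are skipped. The names `RULES` and `select_kept` are illustrative.

```python
from datetime import datetime, timedelta
from typing import Iterable

# (rule name, function mapping a timestamp to its period, how many periods to keep)
# A limit of None means "keep forever".
RULES = [
    ("daily", lambda t: t.strftime("%Y-%m-%d"), 7),
    ("weekly", lambda t: "%d-W%02d" % t.isocalendar()[:2], 4),
    ("monthly", lambda t: t.strftime("%Y-%m"), 12),
    ("yearly", lambda t: t.strftime("%Y"), None),
]


def select_kept(backups: Iterable[datetime]) -> set[datetime]:
    """Return the backup timestamps that the schedule retains."""
    newest_first = sorted(backups, reverse=True)
    kept: set[datetime] = set()
    for _name, period_of, limit in RULES:
        # Periods already represented by backups kept under earlier rules.
        covered = {period_of(t) for t in kept}
        chosen: set[str] = set()
        for t in newest_first:
            if t in kept:
                continue
            period = period_of(t)
            if period in covered or period in chosen:
                continue  # a newer backup already represents this period
            if limit is not None and len(chosen) >= limit:
                break     # this rule has filled all of its slots
            chosen.add(period)
            kept.add(t)
    return kept


if __name__ == "__main__":
    # One backup per day at 02:00 for 60 days, starting 2024-01-01.
    history = [datetime(2024, 1, 1, 2) + timedelta(days=i) for i in range(60)]
    kept = select_kept(history)
    print(f"kept {len(kept)} of {len(history)} backups")
    for t in sorted(kept, reverse=True):
        print(" ", t.date())
```

The rules are applied in order from finest to coarsest, so recent history stays dense while older history is reduced to one backup per week, month, and year.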
This ensures that you can go back in time after a long-running hacking attempt, or when an accidental deletion is not discovered immediately.
Remote Backup¶
The second layer of backup ties the local PBS to a remote PBS using the PBS replication feature. Any TAPPaaS system can act as the remote PBS for another TAPPaaS system. Backups are encrypted by default, so you do not need to trust the remote TAPPaaS operator.
Personal Backup¶
The final layer of backup allows individual users to create their own data backup on detachable media (typically a USB hard disk). This data is stored in the native format of the individual applications. This backup also allows a user to leave a TAPPaaS system without losing their data.
Disaster Recovery¶
There are generally 4 types of disasters to deal with:
- Hardware failure (including environmental failures like power spikes or fire)
- Software updates that go wrong
- Attackers who infiltrate the system and destroy or encrypt data, or otherwise make the system unreliable
- Accidental deletion of data or disruption of the working installation by a user or administrator
Recovery Methods¶
TAPPaaS operates with 3 kinds of disaster recovery methods:
- Rebuild from backup: Restore the system from a local or remote backup
- Rent space on another TAPPaaS system: Re-establish services from backup in separate VLANs
- Cloud recovery: Rent VPCs from a cloud provider and re-establish the VMs there