ACM Active-Passive Hub for Disaster Recovery
Through this post we are going to learn how to configure an active-passive hub cluster configuration, where the initial hub cluster backs up data and one, or more passive hub clusters are on stand-by. When the active cluster becomes unavailable the passive hub become active restoring passive data and active data to control the managed clusters.
- OCP v4.6 or above
- ACM v2.6
- OADP v1.0
- S3 Object Storage
The original files contained in backup and restore folders can be found here.
How to do
1) Once the Advanced Cluster Management for Kubernetes
operator is installed in RHOCP, it is necessary to create a multiclusterhub
instance.
2) When the state of multiclusterhub
instance is phase:available
, it is possible to enable the backup configuation modifying the YAML
definition:
- enabled: true
name: cluster-backup
3) The ACM operator will reconcilie and start the instalation of the OADP operator in the namespace open-cluster-management-backup
. The OADP operator is installed in v1.0, wich is linked to Velero v1.7, check the Velero Version Relationship.
NOTE: Do not change the OADP operator channel to v1.1, the Velero pod is not able to connect to S3 storage. See this Knowledge Base Article.
- The activation of the
cluster-backup
will also provide a policy, you can find the policy in the Governance section under the ACM UI console.
4) Create and configure an S3 Object Storage where you want to save all the buckups.
5) Create the secret in the same namespace that ACM has installed the OADP operator:
$ oc create secret generic cloud-credentials -n open-cluster-management-backup --from-file cloud=credentials-velero
6) Now, create a Data Protection Application instance in the OADP operator. Change the region
, bucket
and prefix
labels accordingly with your environment.
$ oc apply -f https://raw.githubusercontent.com/jtovarro/rhacm-active-passive-hub/main/oadp-operator/data-protection-application.yaml
7) Configure your first buckup in the active hub.
$ oc apply -f https://raw.githubusercontent.com/jtovarro/rhacm-active-passive-hub/main/backup/backup-schedule.yaml
8) In the passive hub cluster we will need to install the same operators with the same configurations as in our active hub. As well, it is necessary to install the ACM operator with the cluster-buckup
label enabled, and then create the Data Protection Applicantion
instance linked to the same S3 Object Storage
where the buckups from the current active hub are pointing, as explained in previous steps.
- There are two kinds of data to restore:
- Passive data: secrets, ConfigMaps, apps, policies and all the managed cluster custom resources.
- Activation data: results in managed clusters being actively managed by the cluster when it is restored on a new hub cluster.
9) In the context of a failure in the active hub, we have the chance to recover our data in the passive hub, with the passive hub going to an activate status. We can also make restores only with passive data and apply the restore of the activation data as last step.
- Apply this yaml if you want to set synchronized restores for passive data:
$ oc apply -f https://raw.githubusercontent.com/jtovarro/active-pasive-hub-cluster/main/restore/restore-passive-sync.yaml
NOTE: make sure the active hub is power-off in case you want to restore the activation data in the passive hub, if not the active hub will try to add the managed clusters back again.
- Apply this yaml if you want to restore the activation data as well as the passive data:
$ oc apply -f https://raw.githubusercontent.com/jtovarro/active-pasive-hub-cluster/main/restore/restore.yaml
NOTE: for ACM v2.6 only clusters created through HIVE API will be added to the passive hub automatically, imported clusters will have a ‘pending’ status and need to be added manually. For ACM v2.7 both HIVE API and imported clusters will be added to the passive hub cluster automatically when a activation restore take place. See more about ACM v2.7 here.
10) If the old hub
becomes available again delete the backupschedule
object and the managedcluster
objects so this hub cluster now is available as passive hub.
Related Links
[1] Backup and Restore documentation