Backup and restore etcd in a Kubernetes cluster

etcd is a distributed key-value store designed to hold small amounts of data that fit entirely in memory. It supports high availability and is widely used in Kubernetes clusters, where it holds all cluster state.
In this post we will describe how to back up and restore the data stored in etcd in the context of a Kubernetes cluster.

Resources
https://etcd.io/
https://github.com/etcd-io/etcd/releases/tag/v3.5.0
https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#restoring-an-etcd-cluster

Note: this procedure is for demonstration purposes on a cluster created with kubeadm (one Control Plane and two Workers); depending on your setup, it should be tailored to your specific case!

Download and install the etcd release (it contains the server and the command-line utilities) on your Control Plane. It makes sense to use the same version your cluster is running; in my case that is v3.5.0.

wget https://storage.googleapis.com/etcd/v3.5.0/etcd-v3.5.0-linux-amd64.tar.gz
tar -zxvf etcd-v3.5.0-linux-amd64.tar.gz
sudo cp etcd-v3.5.0-linux-amd64/etcdctl /usr/bin/
sudo cp etcd-v3.5.0-linux-amd64/etcdutl /usr/bin/
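A quick sanity check after installing: the client tools should report the same version as the cluster's etcd (v3.5.0 here). This is just a sketch; the `command -v` guard skips the calls on machines where the binaries aren't installed.

```shell
# Print the client tool versions; they should match the cluster's etcd (v3.5.0 here)
if command -v etcdctl >/dev/null 2>&1; then
  etcdctl version
  etcdutl version
fi
echo "etcd tool version check done"
```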

Test etcdctl by listing the namespace keys. Adjust the endpoint to your setup; in my case, with a single Control Plane, etcd listens only on 127.0.0.1. Note that ETCDCTL_API is set after sudo so it actually reaches etcdctl (sudo strips the environment by default).

sudo ETCDCTL_API=3 etcdctl get /registry/namespaces --prefix --keys-only --endpoints https://127.0.0.1:2379 --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt
/registry/namespaces/default
/registry/namespaces/home-server
/registry/namespaces/kube-node-lease
/registry/namespaces/kube-public
/registry/namespaces/kube-system
/registry/namespaces/kubernetes-dashboard

Create a test ConfigMap before taking the backup

kubectl create configmap backup-etcd-test

Take the backup; the data will be saved in a file named "my-backup"

sudo ETCDCTL_API=3 etcdctl snapshot save my-backup --endpoints https://127.0.0.1:2379 --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt
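For anything beyond a one-off demo you will want unique snapshot names and an integrity check before trusting the file. A minimal sketch, assuming the kubeadm default certificate paths used above; the backup directory is a hypothetical choice, and the etcd calls are guarded so the script is a no-op where etcdctl isn't installed.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical backup location; adjust to your environment
BACKUP_DIR="${BACKUP_DIR:-/var/backups/etcd}"
# Timestamped name so repeated backups don't overwrite each other
SNAPSHOT="${BACKUP_DIR}/etcd-$(date +%Y%m%d-%H%M%S).db"
echo "snapshot target: ${SNAPSHOT}"

if command -v etcdctl >/dev/null 2>&1; then
  sudo mkdir -p "$BACKUP_DIR"
  sudo ETCDCTL_API=3 etcdctl snapshot save "$SNAPSHOT" \
    --endpoints https://127.0.0.1:2379 \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt
  # Verify the snapshot before trusting it: prints hash, revision, key count, size
  sudo etcdutl snapshot status "$SNAPSHOT" -w table
fi
```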

Delete the test configmap

kubectl delete configmap backup-etcd-test

Restore the snapshot into a new data directory named my-backup.etcd

sudo etcdutl snapshot restore my-backup --data-dir my-backup.etcd
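The restore does not unpack an archive; it builds a fresh etcd data directory from the snapshot. Assuming the restore above succeeded, you can peek at the result (guarded in case the directory doesn't exist):

```shell
# The restored directory has the standard etcd layout (member/snap, member/wal)
if [ -d my-backup.etcd ]; then
  find my-backup.etcd -maxdepth 2 -type d
fi
echo "restore inspection done"
```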

Stop the kube-apiserver and etcd static pods on your Control Plane by moving their manifests out of /etc/kubernetes/manifests; the kubelet stops a static pod when its manifest is removed. I deployed my cluster via kubeadm, so kube-apiserver runs as a static pod. (If in your case it is running as a service, use systemctl to stop it.)

sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp
sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp

Replace /var/lib/etcd/ with the restored my-backup.etcd directory

sudo rm -rf /var/lib/etcd/
sudo mv my-backup.etcd /var/lib/etcd/

Start etcd and kube-apiserver again by moving the manifests back.

sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/
sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/

It is also recommended to restart the other components (kube-scheduler, kube-controller-manager, kubelet) to ensure they don't rely on stale data.
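On a kubeadm cluster the scheduler and controller-manager also run as static pods, so restarting the kubelet re-creates them from their manifests. A sketch of that step, guarded so it does nothing on machines without an active kubelet service:

```shell
# Restart the kubelet; static pods (kube-scheduler, kube-controller-manager)
# are re-created from their manifests when it comes back
if command -v systemctl >/dev/null 2>&1 && systemctl is-active --quiet kubelet; then
  sudo systemctl restart kubelet
  # Watch the control-plane pods come back
  kubectl get pods -n kube-system
fi
echo "component restart step finished"
```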

Test: our ConfigMap is back!

kubectl get configmap
NAME               DATA   AGE
backup-etcd-test   0      45m
kube-root-ca.crt   1      70d
