Back up and restore etcd in a Kubernetes cluster

etcd is a distributed key-value store. It is designed to hold small amounts of data that fit entirely in memory.
It supports high availability and is widely used in Kubernetes clusters.
In this post we will describe how to back up and restore the data stored in the etcd database in the context of a Kubernetes cluster.

Resources
https://etcd.io/
https://github.com/etcd-io/etcd/releases/tag/v3.5.0
https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#restoring-an-etcd-cluster

Note: this procedure is for demonstration purposes on a cluster created with kubeadm (one Control Plane and two Workers); depending on your setup, it should be tailored to your specific case!

Download and install the etcd release, which contains the server and the command-line utilities, on your Control Plane. It makes sense to use the same version that your cluster is running; in my case that is v3.5.0.

wget https://storage.googleapis.com/etcd/v3.5.0/etcd-v3.5.0-linux-amd64.tar.gz
tar -zxvf etcd-v3.5.0-linux-amd64.tar.gz
sudo cp etcd-v3.5.0-linux-amd64/etcdctl /usr/bin/
sudo cp etcd-v3.5.0-linux-amd64/etcdutl /usr/bin/
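
If you are not sure which version your cluster is running, on a kubeadm cluster you can read the image tag of the etcd static pod (this assumes the default component=etcd label that kubeadm applies):

kubectl -n kube-system get pods -l component=etcd -o jsonpath='{.items[*].spec.containers[*].image}'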

Test etcdctl by listing the namespace keys. Edit the endpoint for your setup; in my case, since I have only one Control Plane, etcd listens only on 127.0.0.1.

ETCDCTL_API=3 sudo etcdctl get /registry/namespaces --prefix --keys-only --endpoints https://127.0.0.1:2379 --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt
/registry/namespaces/default

/registry/namespaces/home-server

/registry/namespaces/kube-node-lease

/registry/namespaces/kube-public

/registry/namespaces/kube-system

/registry/namespaces/kubernetes-dashboard
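
As an extra sanity check before taking a backup, the same flags can be used to verify that the endpoint is healthy:

ETCDCTL_API=3 sudo etcdctl endpoint health --endpoints https://127.0.0.1:2379 --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt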

Create a test configmap before the backup

kubectl create configmap backup-etcd-test

Take the backup; the data will be saved in a file named “my-backup”

ETCDCTL_API=3 sudo etcdctl snapshot save my-backup --endpoints https://127.0.0.1:2379 --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --cacert=/etc/kubernetes/pki/etcd/ca.crt
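
Optionally, verify the snapshot file with etcdutl before moving on (assuming the v3.5 utilities installed above); this prints the hash, revision, total key count and size of the snapshot:

sudo etcdutl snapshot status my-backup -w table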

Delete the test configmap

kubectl delete configmap backup-etcd-test

Restore the snapshot into a new data directory named my-backup.etcd

sudo etcdutl snapshot restore my-backup --data-dir my-backup.etcd
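
The restored directory should contain the usual etcd data layout (a member directory with snap and wal inside); a quick look to confirm:

sudo ls my-backup.etcd/member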

Stop the kube-apiserver and etcd static pods on your Control Plane by moving their manifests out of the manifests directory. I deployed my cluster via kubeadm, so kube-apiserver runs as a static pod. (If in your case it runs as a service, use systemctl to stop it.)

sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp
sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp
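
Since kubectl will not work while the API server is down, you can confirm that both containers are gone with the container runtime CLI (this assumes a CRI runtime with crictl installed; give the kubelet a few seconds to stop them):

sudo crictl ps | grep -E 'etcd|kube-apiserver'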

Replace /var/lib/etcd/ with the restored my-backup.etcd directory

sudo rm -rf /var/lib/etcd/
sudo mv my-backup.etcd /var/lib/etcd/

Start etcd and kube-apiserver again by moving the manifests back.

sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/
sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/

The recommendation would be to also restart the other components, like kube-scheduler, kube-controller-manager and the kubelet, to ensure they don’t rely on stale data.
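
One possible way to do this on a kubeadm cluster, assuming the scheduler and controller-manager run as static pods and the kubelet as a systemd service, is to bounce their manifests the same way and restart the kubelet:

sudo mv /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/manifests/kube-controller-manager.yaml /tmp
sleep 20
sudo mv /tmp/kube-scheduler.yaml /tmp/kube-controller-manager.yaml /etc/kubernetes/manifests/
sudo systemctl restart kubelet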

Test: our configmap is back!

kubectl get configmap
NAME               DATA   AGE
backup-etcd-test   0      45m
kube-root-ca.crt   1      70d

Create a kubeconfig with a “normal user” and x509 client certificates

Kubernetes has two categories of users: “normal users” and service accounts.
Service accounts are managed by Kubernetes, while “normal users” are not; no objects are added to the cluster to represent them. In this post, we will create a “normal user” with an x509 client certificate and use it in our Kubernetes cluster.
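
A quick way to see the difference: service accounts exist as API objects, while there is no API resource for “normal users”. On a vanilla cluster, the first command below lists objects and the second returns nothing:

kubectl get serviceaccounts -n default
kubectl api-resources | grep -iw users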

Reference:
Kubernetes documentation

Create and sign the certificate

The following commands should be run on the Control Plane, as we need the Kubernetes cluster’s CA certificate and key.

Generate our private key

openssl genrsa -out oueta.key 2048

Create a Certificate Signing Request (CSR).
Kubernetes determines the username from the common name (CN) field in the ‘subject’ of the certificate, and the groups from the organization (O) fields. In my example the subject is CN=oueta/O=Group1/O=Group2, meaning the user “oueta” is part of two groups, Group1 and Group2. A user can have zero or many groups.

openssl req -new -key oueta.key -out oueta.csr -subj "/CN=oueta/O=Group1/O=Group2"
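
You can double check the subject that ended up in the CSR with openssl before signing it:

openssl req -in oueta.csr -noout -subject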

Sign the certificate with our Kubernetes cluster’s Certificate Authority (CA), making it valid for 365 days.

sudo openssl x509 -req -in oueta.csr -CA /etc/kubernetes/pki/ca.crt -CAkey /etc/kubernetes/pki/ca.key -CAcreateserial -out oueta.crt -days 365
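
A quick inspection of the signed certificate shows the subject, the issuer (the cluster CA) and the expiration date:

openssl x509 -in oueta.crt -noout -subject -issuer -enddate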

Create the kubeconfig file with our cluster and authentication information.

kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/pki/ca.crt --embed-certs=true --server=https://10.255.0.252:6443 --kubeconfig=oueta.kubeconfig
kubectl config set-credentials oueta --client-certificate=oueta.crt --client-key=oueta.key --embed-certs=true --kubeconfig=oueta.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=oueta --kubeconfig=oueta.kubeconfig
kubectl config use-context default --kubeconfig=oueta.kubeconfig

We are done generating and signing the certificate and the kubeconfig is ready, so let’s test.

kubectl get pods --kubeconfig=oueta.kubeconfig
Error from server (Forbidden): pods is forbidden: User "oueta" cannot list resource "pods" in API group "" in the namespace "default"

As expected, because we don’t have any kind of access yet. In Kubernetes we can define two types of permission objects, Role and ClusterRole. With a Role object we can define permissions within a single namespace, while with a ClusterRole we can define cluster-scoped permissions. More information about Role-based access control (RBAC) can be found in the Kubernetes documentation.
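
The same denial can be confirmed with kubectl auth can-i, which simply answers yes or no for a given verb and resource:

kubectl auth can-i list pods --kubeconfig=oueta.kubeconfig
no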

Create a Role and define the permissions.

cat << EOF | kubectl create -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  creationTimestamp: null
  name: oueta-role
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - pods/log
  verbs:
  - get
  - list
  - watch
EOF

Create a RoleBinding to bind our user to the Role created earlier.

cat << EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  creationTimestamp: null
  name: oueta-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: oueta-role
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: oueta
EOF

Test

kubectl get pods --kubeconfig=oueta.kubeconfig
NAME                         READY   STATUS    RESTARTS   AGE
test-nginx-59ffd87f5-vvpdt   1/1     Running   0          111s
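
Since the Role also grants access to the pods/log subresource, the same check works for reading logs (using the same kubeconfig):

kubectl auth can-i get pods/log --kubeconfig=oueta.kubeconfig
yes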