Etcd
Etcd is a consistent and highly-available key-value store used as the Kubernetes backing store for all cluster data.
Clustering
Etcd can be deployed on a single server for testing and development, or as a cluster in production to ensure the high availability of the service.
Etcd is the default database of Kubernetes: if the service goes down, Kubernetes is no longer able to perform any action on the deployed objects.
Depending on the size of the cluster, Etcd is usually deployed on the master nodes, but remember that in production it is recommended to dedicate separate servers to the Etcd cluster.
Initiating a cluster
For durability and high availability, run etcd as a multi-node cluster in production and back it up periodically. A five-member cluster is recommended in production.
Single-node etcd cluster
Use a single-node etcd cluster only for testing purposes.
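As a minimal sketch, a single-node instance only needs its client URLs to start; $PRIVATE_IP is a placeholder for the address of the host:

```bash
# Start a standalone etcd instance listening for clients on port 2379.
# $PRIVATE_IP is a placeholder for the address of the host.
etcd --listen-client-urls=http://${PRIVATE_IP}:2379 \
  --advertise-client-urls=http://${PRIVATE_IP}:2379
```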
Multi-node etcd cluster
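A sketch of bootstrapping a three-node cluster; the member names (infra0 to infra2) and addresses (10.0.1.10 to 10.0.1.12) are illustrative assumptions. The equivalent command is run on each host, adjusting --name and the URL flags to match that host:

```bash
# Bootstrap the first member of a new three-node cluster (run the
# equivalent command on each host, adjusting --name and the URLs).
etcd --name infra0 \
  --initial-advertise-peer-urls http://10.0.1.10:2380 \
  --listen-peer-urls http://10.0.1.10:2380 \
  --listen-client-urls http://10.0.1.10:2379,http://127.0.0.1:2379 \
  --advertise-client-urls http://10.0.1.10:2379 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380 \
  --initial-cluster-state new
```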
Getting members
Getting all the members of an etcd cluster.
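For example (the endpoint and certificate paths are assumptions to adapt to your deployment):

```bash
# List the members of the cluster; the ID column is the NODE_ID used
# when removing a member.
ETCDCTL_API=3 etcdctl member list \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.crt \
  --cert=/etc/etcd/etcd.crt \
  --key=/etc/etcd/etcd.key
```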
Adding member
Adding a node to an existing cluster. This command must be run on an existing node of the cluster.
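A sketch, assuming the new node is named infra3 at 10.0.1.13 (illustrative values):

```bash
# Declare the new member to the cluster before starting it.
etcdctl member add infra3 --peer-urls=http://10.0.1.13:2380
```

The command prints the ETCD_NAME, ETCD_INITIAL_CLUSTER and ETCD_INITIAL_CLUSTER_STATE values to use when starting the new member.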
Removing member
Removing a node from an existing cluster. The NODE_ID can be found in the response of the member list command.
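For example, using the hexadecimal ID reported by the member list command (TLS flags omitted for brevity):

```bash
# Remove the member whose ID is $NODE_ID from the cluster.
etcdctl member remove ${NODE_ID}
```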
Starting member
Starting a node requires some metadata to configure it within the cluster. This command must be run after adding the node to the cluster; it does not initiate a new cluster.
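A sketch using the same illustrative names and addresses as above; the key flag is --initial-cluster-state existing, which joins the node to the running cluster instead of bootstrapping a new one:

```bash
# Start the new member; --initial-cluster-state existing joins an
# existing cluster rather than initiating a new one.
etcd --name infra3 \
  --initial-advertise-peer-urls http://10.0.1.13:2380 \
  --listen-peer-urls http://10.0.1.13:2380 \
  --listen-client-urls http://10.0.1.13:2379,http://127.0.0.1:2379 \
  --advertise-client-urls http://10.0.1.13:2379 \
  --initial-cluster infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380 \
  --initial-cluster-state existing
```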
Backing up
All Kubernetes objects are stored on etcd. Periodically backing up the etcd cluster data is important to recover Kubernetes clusters under disaster scenarios, such as losing all master nodes.
Running backup
Running a backup on an existing etcd cluster.
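For example (the endpoint and certificate paths are assumptions to adapt to your deployment):

```bash
# Save a point-in-time snapshot of the etcd keyspace to snapshot.db.
ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.crt \
  --cert=/etc/etcd/etcd.crt \
  --key=/etc/etcd/etcd.key
```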
Getting status
Getting the status of a backup.
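For example, against the snapshot file produced above:

```bash
# Print the hash, revision, total keys and size of the snapshot file.
ETCDCTL_API=3 etcdctl snapshot status snapshot.db --write-out=table
```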
Restoring
To restore a cluster, all that is needed is a single snapshot "db" file. A cluster restore with etcdctl snapshot restore creates new etcd data directories; all members should restore using the same snapshot. Restoring overwrites some snapshot metadata (specifically, the member ID and cluster ID); the member loses its former identity. This metadata overwrite prevents the new member from inadvertently joining an existing cluster. Therefore, in order to start a cluster from a snapshot, the restore must start a new logical cluster.
In the case of a cluster of 3 servers, the command below must be run on each node to reconfigure the cluster from the same backup file.
The parameters ETCD_NODE_NAMEX and ETCD_NODE_IPX must be updated with each host's information to correctly restore the cluster.
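A sketch for the first node: on the other nodes, change --name and --initial-advertise-peer-urls to that host's ETCD_NODE_NAMEX and ETCD_NODE_IPX values while keeping --initial-cluster identical. The data directory path and cluster token are assumptions:

```bash
# Restore the snapshot into a fresh data directory for node 1; run the
# equivalent command on each node with its own name and IP.
ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
  --name ${ETCD_NODE_NAME1} \
  --data-dir /var/lib/etcd \
  --initial-cluster ${ETCD_NODE_NAME1}=https://${ETCD_NODE_IP1}:2380,${ETCD_NODE_NAME2}=https://${ETCD_NODE_IP2}:2380,${ETCD_NODE_NAME3}=https://${ETCD_NODE_IP3}:2380 \
  --initial-cluster-token etcd-cluster-restore \
  --initial-advertise-peer-urls https://${ETCD_NODE_IP1}:2380
```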
External documentation
To go further in the management of etcd, please refer to the following documentation:
Official Kubernetes documentation on how to operate an etcd cluster
Official GitHub documentation
Official GitHub documentation on disaster recovery
Official GitHub documentation on etcd clustering