EvenChan's Ops.

k8s Cluster Backup and Restore

2020/05/29

Introduction

Backing up and restoring a k8s cluster mainly comes down to backing up and restoring its etcd cluster.

Basic etcd query operations

Check cluster health

# Binary deployment:
ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/cert/ca.pem --cert=/etc/etcd/cert/etcd.pem --key=/etc/etcd/cert/etcd-key.pem --endpoints=https://10.16.2.17:2379,https://10.16.2.18:2379,https://10.16.2.19:2379 endpoint health

# Alibaba kubeadm deployment:
ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.pem --cert=/etc/kubernetes/pki/etcd/etcd-client.pem --key=/etc/kubernetes/pki/etcd/etcd-client-key.pem --endpoints=https://192.168.34.130:2379,https://192.168.34.131:2379,https://192.168.34.132:2379,https://192.168.34.133:2379,https://192.168.34.134:2379 endpoint health
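The same TLS and endpoint flags recur in every command. A minimal sketch of a wrapper that sets them once, assuming the binary-deployment paths and endpoints shown above; the function name `etcdctl3` and the `etcd_*` variables are hypothetical helpers, not part of etcdctl.

```shell
# Hypothetical helper: define the connection flags once and reuse them.
# Paths and endpoints are the binary-deployment values from the example above.
etcd_cacert=/etc/kubernetes/cert/ca.pem
etcd_cert=/etc/etcd/cert/etcd.pem
etcd_key=/etc/etcd/cert/etcd-key.pem
etcd_endpoints=https://10.16.2.17:2379,https://10.16.2.18:2379,https://10.16.2.19:2379

etcdctl3() {
  # Forward any subcommand and arguments to etcdctl with the v3 API selected.
  ETCDCTL_API=3 etcdctl \
    --cacert="$etcd_cacert" \
    --cert="$etcd_cert" \
    --key="$etcd_key" \
    --endpoints="$etcd_endpoints" \
    "$@"
}

# Example: etcdctl3 endpoint health
```

For the kubeadm deployment, only the four variables would need to change.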

Get the value of a key

# Binary deployment:
ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/cert/ca.pem --cert=/etc/etcd/cert/etcd.pem --key=/etc/etcd/cert/etcd-key.pem --endpoints=https://10.16.2.17:2379,https://10.16.2.18:2379,https://10.16.2.19:2379 get /registry/apiregistration.k8s.io/apiservices/v1.apps

# Alibaba kubeadm deployment:
ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.pem --cert=/etc/kubernetes/pki/etcd/etcd-client.pem --key=/etc/kubernetes/pki/etcd/etcd-client-key.pem --endpoints=https://192.168.34.130:2379,https://192.168.34.131:2379,https://192.168.34.132:2379,https://192.168.34.133:2379,https://192.168.34.134:2379 get /registry/apiregistration.k8s.io/apiservices/v1.apps

Get etcd version information

Note that version reports the etcdctl client version and the API version; the VERSION column of endpoint status -w table shows the version of each server.

# Binary deployment:
ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/cert/ca.pem --cert=/etc/etcd/cert/etcd.pem --key=/etc/etcd/cert/etcd-key.pem --endpoints=https://10.16.2.17:2379,https://10.16.2.18:2379,https://10.16.2.19:2379 version

# Alibaba kubeadm deployment:
ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.pem --cert=/etc/kubernetes/pki/etcd/etcd-client.pem --key=/etc/kubernetes/pki/etcd/etcd-client-key.pem --endpoints=https://192.168.34.130:2379,https://192.168.34.131:2379,https://192.168.34.132:2379,https://192.168.34.133:2379,https://192.168.34.134:2379 version

List all keys in etcd

# Binary deployment:
ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/cert/ca.pem --cert=/etc/etcd/cert/etcd.pem --key=/etc/etcd/cert/etcd-key.pem --endpoints=https://10.16.2.17:2379,https://10.16.2.18:2379,https://10.16.2.19:2379 get / --prefix --keys-only

# Alibaba kubeadm deployment:
ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.pem --cert=/etc/kubernetes/pki/etcd/etcd-client.pem --key=/etc/kubernetes/pki/etcd/etcd-client-key.pem --endpoints=https://192.168.34.130:2379,https://192.168.34.131:2379,https://192.168.34.132:2379,https://192.168.34.133:2379,https://192.168.34.134:2379 get / --prefix --keys-only
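On a real cluster the full key listing can run to thousands of lines. As a rough sketch, the output can be piped through `awk` to count keys per top-level segment. The function name is hypothetical, and the sketch assumes the Kubernetes keys sit under `/registry/`, as in the `get` example earlier.

```shell
# Hypothetical helper: summarize `get / --prefix --keys-only` output read from stdin.
count_keys_by_segment() {
  # Split each key on "/" so that $3 is the segment right after /registry.
  # Blank lines and keys outside /registry are ignored.
  awk -F/ '$2 == "registry" && $3 != "" { n[$3]++ }
           END { for (s in n) printf "%d %s\n", n[s], s }' |
    sort -k1,1nr -k2
}

# Example: <etcdctl command from above> get / --prefix --keys-only | count_keys_by_segment
```

The result is one line per segment, largest count first, which gives a quick view of what dominates the keyspace before a backup.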

Backup

This post takes backups with snapshot save. A snapshot from a single member is enough for each backup.

# Binary deployment example:
ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/cert/ca.pem --cert=/etc/etcd/cert/etcd.pem --key=/etc/etcd/cert/etcd-key.pem --endpoints=https://10.16.2.17:2379 snapshot save /data/etcd_backup_dir/etcd-snapshot-$(date +%Y%m%d).db
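A daily snapshot accumulates files over time. The following is a minimal sketch of two helpers that could wrap the backup command: one builds the date-stamped file name, the other prunes old snapshots. The function names and the 7-day retention period are assumptions for illustration, not something the original setup prescribes.

```shell
# Hypothetical backup helpers; the retention period is an assumption.
backup_dir=/data/etcd_backup_dir
keep_days=7

snapshot_path() {
  # Date-stamped file name, matching the naming used in the command above.
  printf '%s/etcd-snapshot-%s.db\n' "$backup_dir" "$(date +%Y%m%d)"
}

prune_old_snapshots() {
  # Delete snapshot files whose modification time is more than keep_days days old.
  find "$backup_dir" -type f -name 'etcd-snapshot-*.db' \
    -mtime +"$keep_days" -exec rm -f {} +
}

# Example (after a successful `snapshot save "$(snapshot_path)"`):
#   prune_old_snapshots
```

It is worth checking a snapshot before relying on it: `etcdctl snapshot status <file>` prints its hash, revision, and key count. On recent etcd releases this subcommand may have moved to `etcdutl`.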

Restore

Stop the kube-apiserver service and make sure it is no longer running.

systemctl stop kube-apiserver

# Confirm that kube-apiserver has stopped (the grep process itself always shows up in the output)
ps -ef | grep kube-apiserver
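A one-off `ps` check does not wait for the process to exit. A sketch of a polling helper follows, assuming the API server runs as a host process named `kube-apiserver` (as with the systemd unit above); `wait_for_exit` is a hypothetical name.

```shell
# Hypothetical helper: poll until no process with the given name is left.
wait_for_exit() {
  name=$1
  tries=${2:-30}            # number of one-second polls before giving up
  while [ "$tries" -gt 0 ]; do
    # `ps -e -o comm=` lists bare command names; -x requires a whole-line match,
    # so the grep process cannot match itself.
    if ! ps -e -o comm= | grep -qxF "$name"; then
      return 0              # process is gone
    fi
    sleep 1
    tries=$((tries - 1))
  done
  return 1                  # still running after all polls
}

# Example: wait_for_exit kube-apiserver 60 || echo "kube-apiserver is still running" >&2
```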

Stop the etcd service on every member of the cluster.

systemctl stop etcd

Move the existing etcd data out of the way.

mv /data/k8s/etcd/data /data/k8s/etcd/data.bak
mv /data/k8s/etcd/wal /data/k8s/etcd/wal.bak
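If a `data.bak` directory is left over from an earlier restore, `mv data data.bak` moves the directory inside it instead of renaming it. A small sketch that sidesteps this with a time-stamped target; `move_aside` is a hypothetical helper.

```shell
# Hypothetical helper: move a directory aside under a unique, time-stamped name.
move_aside() {
  src=$1
  dest="$src.bak.$(date +%Y%m%d%H%M%S)"
  # Refuse to continue if the target exists: `mv a b` would move a *into* b.
  if [ -e "$dest" ]; then
    echo "refusing to overwrite $dest" >&2
    return 1
  fi
  # Print the new location so the caller can record it.
  mv "$src" "$dest" && printf '%s\n' "$dest"
}

# Example:
#   move_aside /data/k8s/etcd/data
#   move_aside /data/k8s/etcd/wal
```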

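Copy the etcd snapshot to the other masters.

# The source directory here differs from the one used in the backup command; use wherever the snapshot was actually saved.
scp /var/lib/etcd_backup/etcd-snapshot-20200414.db root@master2:/data/etcd_backup_dir/
scp /var/lib/etcd_backup/etcd-snapshot-20200414.db root@master3:/data/etcd_backup_dir/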

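On every master, restore from the snapshot using the values in that node's own etcd startup file. The command below shows one node; --name, --initial-cluster, --initial-cluster-token, and --initial-advertise-peer-urls must match the configuration of the node being restored.

ETCDCTL_API=3 etcdctl snapshot restore /data/etcd_backup_dir/etcd-snapshot-20200414.db \
--name bjxg-sy-test \
--initial-cluster "bjxg-sy-test=https://10.16.2.17:2380" \
--initial-cluster-token etcd-cluster-0 \
--initial-advertise-peer-urls https://10.16.2.17:2380 \
--data-dir=/data/k8s/etcd/data --wal-dir=/data/k8s/etcd/wal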

To find the name, cluster token, and other values needed for the restore, look at the etcd service (its command line is only shown while the service is running):

[root@etcd_backup_dir]# systemctl status etcd
● etcd.service - Etcd Server
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-04-14 11:44:42 CST; 2h 38min ago
     Docs: https://github.com/coreos
 Main PID: 25729 (etcd)
    Tasks: 15
   Memory: 55.6M
   CGroup: /system.slice/etcd.service
           └─25729 /opt/k8s/bin/etcd --data-dir=/data/k8s/etcd/data --wal-dir=/data/k8s/etcd/wal ........

The values themselves are read from the /etc/systemd/system/etcd.service file.
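Copying these values by hand is error-prone. The sketch below pulls a single option out of the unit file, assuming options are written as `--flag=value` on the `ExecStart` lines, as the `--data-dir` and `--wal-dir` options in the status output are. `unit_flag` is a hypothetical helper.

```shell
# Hypothetical helper: print the value of one --flag=value option from a unit file.
unit_flag() {
  flag=$1
  file=${2:-/etc/systemd/system/etcd.service}
  # Take the first match only, then strip the flag name, a trailing
  # line-continuation backslash, and surrounding double quotes.
  grep -o -- "--$flag=[^ ]*" "$file" | head -n 1 |
    sed -e "s/^--$flag=//" -e 's/\\$//' -e 's/^"//' -e 's/"$//'
}

# Example:
#   unit_flag name
#   unit_flag initial-cluster
#   unit_flag initial-cluster-token
```

If the unit file separates flag and value with a space rather than `=`, the pattern would need adjusting.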

Start etcd, then kube-apiserver.

systemctl start etcd
systemctl start kube-apiserver

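Summary

Backing up a Kubernetes cluster mainly means backing up the etcd cluster. When restoring, the main thing to get right is the order of operations:

Stop kube-apiserver --> stop etcd --> restore data --> start etcd --> start kube-apiserver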
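The ordering can be sketched as a script for a single node. This is an illustration under assumptions, not a complete procedure: `run` and `restore_sequence` are hypothetical names, the per-node restore flags are passed in by the caller, and in a multi-member cluster each phase has to finish on every node before the next one begins. By default the sketch only prints the commands.

```shell
# Hypothetical sketch for ONE node; prints commands unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
  # In dry-run mode print the command; otherwise execute it and return its status.
  if [ "$DRY_RUN" = 1 ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Usage: restore_sequence <snapshot-file> [per-node restore flags...]
# The && chain stops at the first step that fails.
restore_sequence() {
  snapshot=$1
  shift
  run systemctl stop kube-apiserver &&
  run systemctl stop etcd &&
  run mv /data/k8s/etcd/data /data/k8s/etcd/data.bak &&
  run mv /data/k8s/etcd/wal /data/k8s/etcd/wal.bak &&
  run env ETCDCTL_API=3 etcdctl snapshot restore "$snapshot" "$@" \
    --data-dir=/data/k8s/etcd/data --wal-dir=/data/k8s/etcd/wal &&
  run systemctl start etcd &&
  run systemctl start kube-apiserver
}

# Example (dry run):
#   restore_sequence /data/etcd_backup_dir/etcd-snapshot-20200414.db --name bjxg-sy-test
```

Reviewing the dry-run output before setting `DRY_RUN=0` gives a chance to check paths and flags against the node's unit file.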

Reference

https://www.jianshu.com/p/8b483ed49f26