
Chapter 12: Long-Term Operations, Backups, and Upgrades

etcd backup and restore operations

Backing up etcd

etcd can be backed up by taking a snapshot with the etcdctl command-line tool; the resulting backup file should be copied to a safe location outside the cluster as soon as possible.

On a cluster built with kubeadm, etcd runs as a pod and stores its data under /var/lib/etcd, a directory mounted from the master node via a hostPath volume.

  • Install jq, a command-line JSON processor
sudo apt-get install jq
  • Inspect the etcd pod's volume mounts with a jsonpath query ($ETCDPOD holds the etcd pod name, e.g. etcd-k8s-msr-1)
kubectl get pod --namespace kube-system $ETCDPOD -o jsonpath='{.spec.containers[0].volumeMounts}' | jq
[
  {
    "mountPath": "/var/lib/etcd",
    "name": "etcd-data"
  },
  {
    "mountPath": "/etc/kubernetes/pki/etcd",
    "name": "etcd-certs"
  }
]
sudo tree /var/lib/etcd/
/var/lib/etcd/
└── member
    ├── snap
    │   ├── 0000000000000016-00000000001bc619.snap
    │   ├── 0000000000000016-00000000001bed2a.snap
    │   ├── 0000000000000016-00000000001c143b.snap
    │   ├── 0000000000000016-00000000001c3b4c.snap
    │   ├── 0000000000000016-00000000001c625d.snap
    │   └── db
    └── wal
        ├── 0.tmp
        ├── 000000000000000f-000000000015c31b.wal
        ├── 0000000000000010-0000000000173561.wal
        ├── 0000000000000011-000000000018a661.wal
        ├── 0000000000000012-00000000001a1b93.wal
        └── 0000000000000013-00000000001b8e6c.wal

3 directories, 12 files
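Each snap file name encodes two zero-padded 64-bit hex numbers (in etcd's naming scheme, the raft term and index). Decoding one of the names from the listing above as a quick sanity check:

```shell
# Snap file names are <term>-<index> in zero-padded hex (etcd's naming scheme).
FILE=0000000000000016-00000000001bc619.snap
name=${FILE%.snap}        # strip the extension
term_hex=${name%-*}       # part before the dash
index_hex=${name#*-}      # part after the dash
echo "term=$((0x$term_hex)) index=$((0x$index_hex))"
```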
# install the etcdctl client
sudo apt install etcd-client
sudo ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /var/lib/dat-backup.db
2024-01-10 09:50:08.219021 I | clientv3: opened snapshot stream; downloading
2024-01-10 09:50:08.260202 I | clientv3: completed snapshot read; closing
Snapshot saved at /var/lib/dat-backup.db
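Since the snapshot should leave the node quickly, the copy step is worth scripting. A minimal sketch as a helper function; the 7-day retention and the example paths in the comment are assumptions to adapt:

```shell
# archive_snapshot <snapshot> <backup_dir>:
# copy an etcd snapshot with a timestamped name and prune old copies.
archive_snapshot() {
  snap=$1
  dir=$2
  mkdir -p "$dir"
  cp "$snap" "$dir/etcd-snapshot-$(date +%Y%m%d-%H%M%S).db"
  # keep only the last 7 days of snapshots (retention period is an assumption)
  find "$dir" -name 'etcd-snapshot-*.db' -mtime +7 -delete
}

# Example, using the snapshot path from above and a hypothetical archive dir:
# archive_snapshot /var/lib/dat-backup.db /backup/etcd
```

From the archive directory the files would typically be synced off the node, e.g. to object storage.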
sudo ETCDCTL_API=3 etcdctl --write-out=table \
snapshot status /var/lib/dat-backup.db
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| ddcc0eaf |    10994 |        802 |     2.4 MB |
+----------+----------+------------+------------+
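`etcdctl snapshot status` can also emit JSON (`--write-out=json`), which is easier to check in scripts before archiving. In this sketch a captured sample stands in for the live call so the check itself is runnable anywhere; the field names are etcdctl's JSON keys:

```shell
# Sample of `etcdctl snapshot status --write-out=json` output (captured, not live)
STATUS='{"hash":3721182374,"revision":10994,"totalKey":802,"totalSize":2516582}'

# Pull the revision out of the JSON
REVISION=$(printf '%s' "$STATUS" | sed 's/.*"revision":\([0-9]*\).*/\1/')

if [ "$REVISION" -gt 0 ]; then
  echo "snapshot at revision $REVISION looks sane"
else
  echo "snapshot has no revisions, refusing to archive it" >&2
  exit 1
fi
```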

Restoring etcd with etcdctl

$ sudo ETCDCTL_API=3 etcdctl snapshot restore /var/lib/dat-backup.db

# Keep the pre-restore data around in case the restore fails
$ sudo mv /var/lib/etcd /var/lib/etcd.OLD

# Move the restored data into place
$ sudo mv ./default.etcd /var/lib/etcd

# Stop the etcd container
# Find the container ID first
sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps
CONTAINER       IMAGE           CREATED       STATE     NAME                      ATTEMPT   POD ID          POD
12aa2cc38d214   ead0a4a53df89   2 hours ago   Running   coredns                   0         55c19e3fe129e   coredns-5dd5756b68-m5h7c
afd918d9a67ad   ead0a4a53df89   2 hours ago   Running   coredns                   0         3a75a1b46a383   coredns-5dd5756b68-rwbt8
80f09cada5a3a   0dc86fe0f22e6   2 hours ago   Running   kube-flannel              0         dd900068e0780   kube-flannel-ds-n97mk
95cb9f516b28e   01cf8d1d322dd   2 hours ago   Running   kube-proxy                0         fba8030e3c20e   kube-proxy-nbfck
e0eda3430381a   c527ad14e0cd5   2 hours ago   Running   kube-controller-manager   0         8617653d93c9b   kube-controller-manager-k8s-msr-1
352946671d00a   9ecc4287300e3   2 hours ago   Running   kube-apiserver            0         22c25b3226e74   kube-apiserver-k8s-msr-1
bca22a7dd2a5b   73deb9a3f7025   2 hours ago   Running   etcd                      0         027d1c481a835   etcd-k8s-msr-1
13b45c643ca97   babc03668f18a   2 hours ago   Running   kube-scheduler            0         6706492009923   kube-scheduler-k8s-msr-1

# Stop the etcd container by its ID (here bca22a7dd2a5b; the kubelet restarts
# the static pod, which picks up the restored data)
sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock stop bca22a7dd2a5b
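When scripting the restore, the etcd container ID can be pulled out of the `crictl ps` listing instead of copied by hand. A sketch using a captured two-line sample of the output above in place of the live command:

```shell
# Captured sample of `crictl ps` output (header plus the etcd row)
PS_OUTPUT='CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD
bca22a7dd2a5b 73deb9a3f7025 2 hours ago Running etcd 0 027d1c481a835 etcd-k8s-msr-1'

# CREATED is three words ("2 hours ago"), so the NAME column is field 7
ETCD_ID=$(printf '%s\n' "$PS_OUTPUT" | awk '$7 == "etcd" { print $1 }')
echo "$ETCD_ID"
```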

Upgrading a kubeadm-based Cluster

Check the current versions:

kubectl version

Client Version: v1.28.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.5

kubectl get nodes

NAME        STATUS   ROLES           AGE    VERSION
k8s-msr-1   Ready    control-plane   119m   v1.28.0
k8s-wrk-1   Ready    <none>          118m   v1.28.0
k8s-wrk-2   Ready    <none>          118m   v1.28.0
  • kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"28", GitVersion:"v1.28.0", GitCommit:"855e7c48de7388eb330da0f8d9d2394ee818fb8d", GitTreeState:"clean", BuildDate:"2023-08-15T10:20:15Z", GoVersion:"go1.20.7", Compiler:"gc", Platform:"linux/amd64"}
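kubeadm only supports upgrading one minor version at a time, so it is worth checking the jump before picking $TARGET_VERSION. A sketch with the current version from above hard-coded and an assumed example target:

```shell
# Refuse upgrades that skip a minor version (kubeadm's supported path is N -> N+1).
CURRENT=v1.28.0   # from `kubectl get nodes` above
TARGET=v1.29.0    # example target version, an assumption

cur_minor=$(printf '%s' "$CURRENT" | cut -d. -f2)
tgt_minor=$(printf '%s' "$TARGET" | cut -d. -f2)

if [ $((tgt_minor - cur_minor)) -gt 1 ]; then
  echo "cannot skip minor versions: upgrade to 1.$((cur_minor + 1)).x first" >&2
  exit 1
fi
echo "ok to upgrade $CURRENT -> $TARGET"
```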

Upgrading the Control Plane

# Using Ubuntu as an example
# update kubeadm
sudo apt-mark unhold kubeadm
sudo apt-get update
sudo apt-cache policy kubeadm
sudo apt-get install -y kubeadm=$TARGET_VERSION
sudo apt-mark hold kubeadm

# drain master node
kubectl drain k8s-msr-1 --ignore-daemonsets

sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v$TARGET_VERSION

# uncordon
kubectl uncordon k8s-msr-1

# update kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt-get update
sudo apt-get install -y kubelet=$TARGET_VERSION kubectl=$TARGET_VERSION
sudo apt-mark hold kubelet kubectl

# restart kubelet to pick up the new version
sudo systemctl daemon-reload
sudo systemctl restart kubelet

Upgrading Worker Nodes

# Using Ubuntu as an example
# On the master node, drain the worker
kubectl drain k8s-wrk-1 --ignore-daemonsets

# update kubeadm
sudo apt-mark unhold kubeadm
sudo apt-get update
sudo apt-get install -y kubeadm=$TARGET_VERSION
sudo apt-mark hold kubeadm

sudo kubeadm upgrade node

# update kubelet
sudo apt-mark unhold kubelet
sudo apt-get update
sudo apt-get install -y kubelet=$TARGET_VERSION
sudo apt-mark hold kubelet

# restart kubelet to pick up the new version
sudo systemctl daemon-reload
sudo systemctl restart kubelet

# Back on the master node, uncordon the worker
kubectl uncordon k8s-wrk-1