The open source tool Velero (formerly Heptio Ark) can safely back up, restore, and migrate Kubernetes cluster resources and persistent volumes, and can be used for disaster recovery. TKE supports using Velero to back up, restore, and migrate cluster resources. For more information, see Using COS as Velero Storage to Implement Backup and Restoration of Cluster Resources and Using Velero to Migrate and Replicate Cluster Resources in TKE. This document describes how to use Velero to seamlessly migrate a self-built or third-party cloud platform Kubernetes cluster to TKE.
Migrating a self-built or third-party cloud platform cluster with Velero works on the same principle as Using Velero to Replicate Cluster Resources in TKE: install a Velero instance in both the source cluster and the target cluster, and point both at the same Tencent Cloud COS bucket. The source cluster then performs a backup, and the target cluster restores it, thereby migrating the resources.
The difference is that when you migrate cluster resources from a self-built or third-party cloud platform to TKE, you must account for environment differences between the two platforms. You can refer to the practical backup and restoration strategies provided by Velero to resolve these differences.
Before migration, it is recommended that you make a detailed migration plan and consider the following points:
Filter and classify the resources that need to be migrated and those that do not, based on your actual needs.
Write backup and restoration strategies based on the filtered and classified resources. For complex scenarios, it is recommended to create backup and restore resources from YAML manifests, which are intuitive and easy to maintain. For simple migration or test scenarios, you can instead specify command-line parameters.
Because the migration crosses cloud platforms, the dynamic storage classes backing PVCs may differ. Plan in advance whether dynamic PVC/PV storage classes need to be remapped, and create the corresponding ConfigMap mapping before the restoration. For more specific differences, you can manually modify the backed-up resource manifests.
Check that the migrated cluster resources meet expectations and that the data is complete and available.
The following describes the detailed steps for migrating resources from cloud platform cluster A to TKE cluster B. For the basic knowledge of Velero backup and restoration involved, please refer to Practical Velero backup/restoration Knowledge.
Deploy an Nginx workload with a PVC (the with-pv.yaml example from Velero) in cluster A. For convenience, you can directly use a dynamic storage class to create the PVC and PV.
Run the following command to view the dynamic storage class information supported by the current cluster, as shown below:
# Get the storage class information supported by the current cluster, where xxx-StorageClass is the storage class code name, and xxx-Provider is the provider code name (the same below).
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
xxx-StorageClass xxx-Provider Delete Immediate true 3d3h
...
Modify the PVC manifest in the with-pv.yaml file to use the storage class named "xxx-StorageClass" in cluster A, so that the PV is created dynamically, as shown below:
...
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nginx-logs
  namespace: nginx-example
  labels:
    app: nginx
spec:
  # Optional: change the PVC storage class to one available on cluster A's cloud platform.
  storageClassName: xxx-StorageClass
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi # The minimum volume size on this cloud platform is 20 Gi, so the storage is set to 20 Gi in this sample.
...
Run the following command to apply the sample with-pv.yaml and create the following cluster resources in the nginx-example namespace, as shown below:
$ kubectl apply -f with-pv.yaml
namespace/nginx-example created
persistentvolumeclaim/nginx-logs created
deployment.apps/nginx-deployment created
service/my-nginx created
The created PVC "nginx-logs" is mounted to the /var/log/nginx directory of the Nginx container as the log storage of the service. This sample accesses the Nginx service in a browser to generate log data in the mounted PVC, so that the data can be compared after restoration, as shown below:
$ kubectl exec -it nginx-deployment-5ccc99bffb-6nm5w bash -n nginx-example
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulting container name to nginx.
Use 'kubectl describe pod/nginx-deployment-5ccc99bffb-6nm5w -n nginx-example' to see all of the containers in this pod.
$ du -sh /var/log/nginx/
84K /var/log/nginx/
# View the first two lines of access.log and error.log.
$ head -n 2 /var/log/nginx/access.log
192.168.0.73 - - [29/Dec/2020:03:02:31 +0000] "GET /?spm=5176.2020520152.0.0.22d016ddHXZumX HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36" "-"
192.168.0.73 - - [29/Dec/2020:03:02:32 +0000] "GET /favicon.ico HTTP/1.1" 404 555 "http://47.242.233.22/?spm=5176.2020520152.0.0.22d016ddHXZumX" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36" "-"
$ head -n 2 /var/log/nginx/error.log
2020/12/29 03:02:32 [error] 6#6: *597 open() "/usr/share/nginx/html/favicon.ico" failed (2: No such file or directory), client: 192.168.0.73, server: localhost, request: "GET /favicon.ico HTTP/1.1", host: "47.242.233.22", referrer: "http://47.242.233.22/?spm=5176.2020520152.0.0.22d016ddHXZumX"
2020/12/29 03:07:21 [error] 6#6: *1172 open() "/usr/share/nginx/html/0bef" failed (2: No such file or directory), client: 192.168.0.73, server: localhost, request: "GET /0bef HTTP/1.0"
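For reference, log data like the above can also be generated from the command line instead of a browser; a minimal sketch, assuming the EXTERNAL-IP of the my-nginx Service is reachable (the address placeholder below is illustrative):
# Send a few requests to the Nginx service to populate access.log.
for i in $(seq 1 10); do curl -s http://<EXTERNAL-IP>/ > /dev/null; done
# Requesting a nonexistent path also generates error.log entries.
curl -s http://<EXTERNAL-IP>/no-such-page > /dev/null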
Run the following command to list all resources in cluster A:
kubectl api-resources --verbs=list -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found --all-namespaces
You can also run the following commands to narrow the output by resource scope:
View cluster-scoped resources (those without a namespace):
kubectl api-resources --namespaced=false --verbs=list -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found
View namespaced resources:
kubectl api-resources --namespaced=true --verbs=list -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found --all-namespaces
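When you already know which namespace you plan to migrate, a convenience variant (not part of the original walkthrough) scopes the same query to that single namespace:
# List namespaced resources only in the nginx-example namespace.
kubectl api-resources --namespaced=true --verbs=list -o name | xargs -n 1 kubectl get --show-kind --ignore-not-found -n nginx-example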
You can filter the resources to migrate based on your actual needs. This sample directly migrates the Nginx workload-related resources in the "nginx-example" namespace from the source cloud platform to TKE. The resources involved are as follows:
$ kubectl get all -n nginx-example
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-5ccc99bffb-tn2sh 2/2 Running 0 2d19h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/my-nginx LoadBalancer 172.21.1.185 x.x.x.x 80:31455/TCP 2d19h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 1/1 1 1 2d19h
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-5ccc99bffb 1 1 1 2d19h
$ kubectl get pvc -n nginx-example
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
nginx-logs Bound d-j6ccrq4k1moziu1l6l5r 20Gi RWO xxx-StorageClass 2d19h
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
d-j6ccrq4k1moziu1l6l5r 20Gi RWO Delete Bound nginx-example/nginx-logs xxx-StorageClass 2d19h
The sample configures a hook strategy in with-pv.yaml that freezes the file system (read-only) before backing up the Nginx workload and unfreezes it (read/write) after the backup. The relevant part of the YAML file is as follows:
...
      annotations:
        # Backup hook annotations: the nginx log directory is frozen (read-only) before the backup starts and unfrozen (read/write) after the backup completes.
        pre.hook.backup.velero.io/container: fsfreeze
        pre.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--freeze", "/var/log/nginx"]'
        post.hook.backup.velero.io/container: fsfreeze
        post.hook.backup.velero.io/command: '["/sbin/fsfreeze", "--unfreeze", "/var/log/nginx"]'
    spec:
      volumes:
      - name: nginx-logs
        persistentVolumeClaim:
          claimName: nginx-logs
      containers:
      - image: nginx:1.17.6
        name: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: "/var/log/nginx"
          name: nginx-logs
          readOnly: false
      - image: ubuntu:bionic
        name: fsfreeze
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: "/var/log/nginx"
          name: nginx-logs
...
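Before running the backup, it can be worth confirming that the hook annotations actually landed on the running Pod; a quick check (the pod name comes from the earlier listing):
# Print the Pod's annotations; expect the four pre/post hook keys shown above.
kubectl -n nginx-example get pod nginx-deployment-5ccc99bffb-tn2sh -o jsonpath='{.metadata.annotations}'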
Write a backup and restoration strategy based on your actual situation, and begin migrating the Nginx workload-related resources from the source cloud platform.
Create the following YAML file to back up the resources that need to be migrated.
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: migrate-backup
  # Must be the namespace in which Velero is installed.
  namespace: velero
spec:
  # Only include resources in the nginx-example namespace.
  includedNamespaces:
  - nginx-example
  # Also include cluster-scoped resources.
  includeClusterResources: true
  # Specify the storage location of the backup data.
  storageLocation: default
  # Specify the storage location of the volume snapshots.
  volumeSnapshotLocations:
  - default
  # Use restic to back up the volumes.
  defaultVolumesToRestic: true
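For simple scenarios, roughly the same backup can be created with the velero CLI instead of a manifest; a sketch using standard velero flags:
# Back up the nginx-example namespace, include cluster-scoped resources,
# and back up volumes with restic.
velero backup create migrate-backup \
  --include-namespaces nginx-example \
  --include-cluster-resources \
  --default-volumes-to-restic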
The backup process is shown below. When the backup status is "Completed" and the error count is 0, the backup has completed successfully.
$ kubectl apply -f backup.yaml
backup.velero.io/migrate-backup created
$ velero backup get
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
migrate-backup InProgress 0 0 2020-12-29 19:24:12 +0800 CST 29d default <none>
$ velero backup get
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
migrate-backup Completed 0 0 2020-12-29 19:24:28 +0800 CST 29d default <none>
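To double-check what was captured before moving on, the velero CLI can describe the backup and show its logs:
# Show the backup contents, including resource lists and pod volume backups.
velero backup describe migrate-backup --details
# Show the backup logs, useful when ERRORS or WARNINGS is non-zero.
velero backup logs migrate-backup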
After the backup is complete, run the following command to temporarily switch the backup storage location to read-only mode, as shown below:
Note: This step is optional. It prevents Velero from creating or deleting backup objects in the backup storage location during the restoration.
kubectl patch backupstoragelocation default --namespace velero \
--type merge \
--patch '{"spec":{"accessMode":"ReadOnly"}}'
Because the dynamic storage classes used by the two platforms differ, you need to create a storage class name mapping for the persistent volume "nginx-logs" through the ConfigMap shown below.
apiVersion: v1
kind: ConfigMap
metadata:
  name: change-storage-class-config
  namespace: velero
  labels:
    velero.io/plugin-config: ""
    velero.io/change-storage-class: RestoreItemAction
data:
  # Map the source storage class name to the Tencent Cloud dynamic storage class cbs.
  xxx-StorageClass: cbs
Run the following command to apply the above ConfigMap configuration, as shown below:
$ kubectl apply -f cm-storage-class.yaml
configmap/change-storage-class-config created
The resource manifests backed up by Velero are stored in COS in JSON format. If you have more specific migration requirements, you can directly download the backup file and customize it. The sample below adds a "jokey-test: jokey-test" annotation to the Nginx Deployment resource. The modification process is as follows:
Downloads % mkdir migrate-backup
# Decompress the backup file.
Downloads % tar -zxvf migrate-backup.tar.gz -C migrate-backup
# Edit the resources to be customized. Here, a "jokey-test: jokey-test" annotation is added to the Nginx Deployment resource.
migrate-backup % cat resources/deployments.apps/namespaces/nginx-example/nginx-deployment.json
{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{"jokey-test":"jokey-test",...
# Repack the modified backup files.
migrate-backup % tar -zcvf migrate-backup.tar.gz *
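Downloading and re-uploading the backup archive can also be done from the command line; a minimal sketch using the COS coscmd tool, assuming coscmd is already configured for the bucket and Velero stores backups under the default backups/ prefix (the paths are illustrative):
# Download the backup archive from COS before editing.
coscmd download backups/migrate-backup/migrate-backup.tar.gz migrate-backup.tar.gz
# After editing and repacking as shown above, upload the modified archive to replace the original.
coscmd upload migrate-backup.tar.gz backups/migrate-backup/migrate-backup.tar.gz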
After completing the customization and repacking, log in to the COS console (or use coscmd as sketched above) to upload the modified backup file and replace the original one. Then create the following Restore resource to restore the backup into cluster B:
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: migrate-restore
  namespace: velero
spec:
  backupName: migrate-backup
  includedNamespaces:
  - nginx-example
  # Fill in the resource types to restore as needed. Nothing under the nginx-example namespace needs to be excluded, so '*' is used here.
  includedResources:
  - '*'
  includeClusterResources: null
  # Resources excluded from the restoration. Here the storageclasses resource type is excluded.
  excludedResources:
  - storageclasses.storage.k8s.io
  # Use the labelSelector to select resources with specific labels. Since no label filtering is needed in this sample, these lines are commented out.
  # labelSelector:
  #   matchLabels:
  #     app: nginx
  # Map the nginx-example namespace to the default namespace in the target cluster.
  namespaceMapping:
    nginx-example: default
  restorePVs: true
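For reference, a roughly equivalent restore can be created with the velero CLI; a sketch using standard velero flags:
velero restore create migrate-restore \
  --from-backup migrate-backup \
  --include-namespaces nginx-example \
  --exclude-resources storageclasses.storage.k8s.io \
  --namespace-mappings nginx-example:default \
  --restore-volumes
Apply the restore manifest and check its status: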
$ kubectl apply -f restore.yaml
restore.velero.io/migrate-restore created
$ velero restore get
NAME BACKUP STATUS STARTED COMPLETED ERRORS WARNINGS CREATED SELECTOR
migrate-restore migrate-backup Completed 2021-01-12 20:39:14 +0800 CST 2021-01-12 20:39:17 +0800 CST 0 0 2021-01-12 20:39:14 +0800 CST <none>
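If the restore reports errors or warnings, the velero CLI can show more detail:
# Describe the restore, including per-resource warnings and errors.
velero restore describe migrate-restore --details
velero restore logs migrate-restore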
Run the following command to check whether the running status of the migrated resource is normal, as shown below:
# Since the "nginx-example" namespace was mapped to the "default" namespace during the restoration, the restored resources run in the "default" namespace.
$ kubectl get all -n default
NAME READY STATUS RESTARTS AGE
pod/nginx-deployment-5ccc99bffb-6nm5w 2/2 Running 0 49s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-user LoadBalancer 172.16.253.216 10.0.0.28 443:30060/TCP 8d
service/kubernetes ClusterIP 172.16.252.1 <none> 443/TCP 8d
service/my-nginx LoadBalancer 172.16.254.16 x.x.x.x 80:30840/TCP 49s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx-deployment 1/1 1 1 49s
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-deployment-5ccc99bffb 1 1 1 49s
The output shows that the migrated resources are running normally.
Next, check whether the configured restoration strategies took effect.
Run the following command to check whether the mapping of the dynamic storage class name is correct, as shown below:
# You can find that the storage class of PVC/PV is already "cbs", indicating that the storage class mapping is successful.
$ kubectl get pvc -n default
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
nginx-logs Bound pvc-bcc17ccd-ec3e-4d27-bec6-b0c8f1c2fa9c 20Gi RWO cbs 55s
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-bcc17ccd-ec3e-4d27-bec6-b0c8f1c2fa9c 20Gi RWO Delete Bound default/nginx-logs cbs 57s
The storage class of the PVC/PV is now "cbs", so the storage class mapping succeeded.
Run the following command to check whether the "jokey-test" annotation that was custom-added to "deployment.apps/nginx-deployment" before the restoration is present, as shown below:
# Obtain the annotation "jokey-test" successfully, indicating that the custom modification of the resource is successful.
$ kubectl get deployment.apps/nginx-deployment -o custom-columns=annotations:.metadata.annotations.jokey-test
annotations
jokey-test
The annotation is returned, so the custom modification of the resource succeeded. Since the command ran against the "default" namespace, this also confirms that the namespace mapping took effect.
Run the following command to check whether the PVC data mounted by the workload is successfully migrated.
# Check the data size in the mounted PVC data directory. The data size is 88K, slightly larger than before the migration, because Tencent Cloud CLB health checks generated additional log entries.
$ kubectl exec -it nginx-deployment-5ccc99bffb-6nm5w -n default -- bash
Defaulting container name to nginx.
Use 'kubectl describe pod/nginx-deployment-5ccc99bffb-6nm5w -n default' to see all of the containers in this pod.
$ du -sh /var/log/nginx
88K /var/log/nginx
# Check the first two log entries. They match the logs from before the migration, indicating that the PVC data was not lost.
$ head -n 2 /var/log/nginx/access.log
192.168.0.73 - - [29/Dec/2020:03:02:31 +0000] "GET /?spm=5176.2020520152.0.0.22d016ddHXZumX HTTP/1.1" 200 612 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36" "-"
192.168.0.73 - - [29/Dec/2020:03:02:32 +0000] "GET /favicon.ico HTTP/1.1" 404 555 "http://47.242.233.22/?spm=5176.2020520152.0.0.22d016ddHXZumX" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36" "-"
$ head -n 2 /var/log/nginx/error.log
2020/12/29 03:02:32 [error] 6#6: *597 open() "/usr/share/nginx/html/favicon.ico" failed (2: No such file or directory), client: 192.168.0.73, server: localhost, request: "GET /favicon.ico HTTP/1.1", host: "47.242.233.22", referrer: "http://47.242.233.22/?spm=5176.2020520152.0.0.22d016ddHXZumX"
2020/12/29 03:07:21 [error] 6#6: *1172 open() "/usr/share/nginx/html/0bef" failed (2: No such file or directory), client: 192.168.0.73, server: localhost, request: "GET /0bef HTTP/1.0"
This document describes the ideas and methods for using Velero to migrate resources of self-built or third-party cloud platform clusters to TKE, and demonstrates a successful, seamless migration of cluster resources from cluster A to cluster B. If you encounter scenarios not covered in this document during the actual migration, please submit a ticket.
Velero provides many useful backup and restoration strategies, as shown below:
Velero includes all objects in a backup or restoration when no filtering options are used. You can specify parameters to filter resources as needed during backup and restoration. For details, please refer to Resource Filtering.
| Parameter | Description |
|---|---|
| --include-resources | Specifies a list of resource objects to include. |
| --include-namespaces | Specifies a list of namespaces to include. |
| --include-cluster-resources | Specifies whether to include cluster-scoped resources. |
| --selector | Includes only resources that match the label selector. |
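For example, a backup restricted with these include filters might look like the following (an illustrative command, not part of the walkthrough above):
# Back up only Deployments and PVCs in nginx-example that carry the app=nginx label.
velero backup create nginx-filtered-backup \
  --include-namespaces nginx-example \
  --include-resources deployments,persistentvolumeclaims \
  --selector app=nginx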
| Parameter | Description |
|---|---|
| --exclude-namespaces | Specifies a list of namespaces to exclude. |
| --exclude-resources | Specifies a list of resource objects to exclude. |
| velero.io/exclude-from-backup=true | A label set on a resource object; resources carrying this label are excluded from backups. |
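The exclusion options are used the same way; an illustrative sketch:
# Exclude the kube-system namespace and all Secrets from the backup.
velero backup create no-secrets-backup \
  --exclude-namespaces kube-system \
  --exclude-resources secrets
# Alternatively, label an individual resource so that every backup skips it.
kubectl label -n nginx-example pod/nginx-deployment-5ccc99bffb-tn2sh velero.io/exclude-from-backup=true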
Starting from v1.5, Velero can back up all Pod volumes with restic by default (as in the Backup manifest above) instead of requiring each Pod to be annotated separately. Velero v1.5 or later is recommended.
For Velero versions earlier than v1.5, restic offers the following two ways to discover the Pod volumes that need to be backed up:
Opt-in: annotate each Pod with the volumes to back up (the default approach):
kubectl -n <YOUR_POD_NAMESPACE> annotate <pod/YOUR_POD_NAME> backup.velero.io/backup-volumes=<YOUR_VOLUME_NAME_1,YOUR_VOLUME_NAME_2,...>
Opt-out: annotate each Pod with the volumes to exclude from backup:
kubectl -n <YOUR_POD_NAMESPACE> annotate <pod/YOUR_POD_NAME> backup.velero.io/backup-volumes-excludes=<YOUR_VOLUME_NAME_1,YOUR_VOLUME_NAME_2,...>
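As a concrete example using the workload from this document, opting the nginx-logs volume in would look like this (pod name from the earlier listing):
# Mark the nginx-logs volume on the Nginx pod for restic backup.
kubectl -n nginx-example annotate pod/nginx-deployment-5ccc99bffb-tn2sh backup.velero.io/backup-volumes=nginx-logs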
After the backup is complete, run the following command to view the backup volume information:
kubectl -n velero get podvolumebackups -l velero.io/backup-name=<YOUR_BACKUP_NAME> -o yaml
After the restoration is complete, run the following command to view the restore volume information:
kubectl -n velero get podvolumerestores -l velero.io/restore-name=<YOUR_RESTORE_NAME> -o yaml