Note: TKE Serverless clusters do not support accessing services via NodeIP:Port. Instead, you need to use ClusterIP:Port to access the services.
The following describes how to migrate resources from TKE cluster A to TKE Serverless cluster B.
For operation details, see Creating a bucket.
Download the latest version of Velero to the cluster environment. Velero v1.8.1 is used as an example in this document.
wget https://github.com/vmware-tanzu/velero/releases/download/v1.8.1/velero-v1.8.1-linux-amd64.tar.gz
Run the following command to decompress the installation package, which contains Velero command lines and some sample files.
tar -xvf velero-v1.8.1-linux-amd64.tar.gz
Run the following command to copy the Velero executable file from the decompressed directory to a directory in the system PATH (/usr/bin in this document):
cp velero-v1.8.1-linux-amd64/velero /usr/bin/
Configure the Velero client and enable CSI.
velero client config set features=EnableCSI
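This command stores the flag in the Velero client configuration file (by default `~/.config/velero/config.json`). After running it, the file should look roughly like this:

```json
{
  "features": "EnableCSI"
}
```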
Run the following command to install Velero in clusters A and B and create Velero workloads as well as other necessary resource objects.
velero install --provider aws \
--plugins velero/velero-plugin-for-aws:v1.1.0,velero/velero-plugin-for-csi:v0.2.0 \
--features=EnableCSI \
--features=EnableAPIGroupVersions \
--bucket <BucketName> \
--secret-file ./credentials-velero \
--use-volume-snapshots=false \
--backup-location-config region=ap-guangzhou,s3ForcePathStyle="true",s3Url=https://cos.ap-guangzhou.myqcloud.com
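The `--secret-file ./credentials-velero` parameter above points to a COS access credential file in the AWS-style INI format read by the velero-plugin-for-aws add-on. A minimal sketch, where `<SecretId>` and `<SecretKey>` are placeholders for your own Tencent Cloud API credentials:

```shell
# Create a sample credentials file for the aws plugin; replace the
# placeholders with the SecretId/SecretKey of your Tencent Cloud account.
cat > ./credentials-velero <<'EOF'
[default]
aws_access_key_id = <SecretId>
aws_secret_access_key = <SecretKey>
EOF
```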
Note: TKE Serverless clusters do not support DaemonSet deployment, so the examples in this document do not use the restic add-on.
./velero install --provider aws --use-volume-snapshots=false --bucket gtest-1251707795 --plugins velero/velero-plugin-for-aws:v1.1.0 --secret-file ./credentials-velero --backup-location-config region=ap-guangzhou,s3ForcePathStyle="true",s3Url=https://cos.ap-guangzhou.myqcloud.com
For installation parameters, see Using COS as Velero Storage to Implement Backup and Restoration of Cluster Resources or run the velero install --help command.
Other installation parameters are as described below:
Parameter | Configuration |
---|---|
--plugins | Use the AWS S3 API-compatible add-on `velero-plugin-for-aws`. To back up `csi-pv`, also use the CSI add-on `velero-plugin-for-csi`; we recommend you enable it. |
--features | Enable optional features: `EnableAPIGroupVersions` provides compatibility with different API group versions, and `EnableCSI` backs up CSI-supported PVCs via snapshots. We recommend you enable both. |
--use-restic | Velero supports the restic open-source tool to back up and restore Kubernetes storage volume data (hostPath volumes are not supported; for details, see here). It supplements the Velero backup feature. However, because restic runs as a DaemonSet, enabling this parameter during migration to a TKE Serverless cluster will cause the backup to fail. |
--use-volume-snapshots=false | Disable the default snapshot backup of storage volumes. |
Run the following command to check that the backup storage location is available:
velero backup-location get
NAME PROVIDER BUCKET/PREFIX PHASE LAST VALIDATED ACCESS MODE DEFAULT
default aws <BucketName> Available 2022-03-24 21:00:05 +0800 CST ReadWrite true
At this point, you have completed the Velero installation. For more information, see Velero Documentation.
Create a VolumeSnapshotClass object in clusters A and B.
Note:
- Skip this step if you don't need to back up the PVC.
- For more information on storage snapshot, see Backing up and Restoring PVC via CBS-CSI Add-on.
Check that you have installed the CBS-CSI add-on.
Grant the CBS snapshot permissions to TKE_QCSRole on the Access Management page of the console. For details, see CBS-CSI.
Use the following YAML to create a VolumeSnapshotClass object:
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  labels:
    velero.io/csi-volumesnapshot-class: "true"
  name: cbs-snapclass
driver: com.tencent.cloud.csi.cbs
deletionPolicy: Delete
Run the following command to check whether the VolumeSnapshotClass has been created successfully, as shown below:
$ kubectl get volumesnapshotclass
NAME DRIVER DELETIONPOLICY AGE
cbs-snapclass com.tencent.cloud.csi.cbs Delete 17m
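With CSI backup enabled, Velero selects this class through its `velero.io/csi-volumesnapshot-class: "true"` label when snapshotting PVCs. To verify the snapshot path independently of Velero, you could create a snapshot manually; a hypothetical example (the PVC name `minio` matches the sample used later in this document):

```yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: minio-snapshot
spec:
  volumeSnapshotClassName: cbs-snapclass
  source:
    persistentVolumeClaimName: minio
```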
Note: Skip this step if you don't need to back up the PVC.
Deploy a MinIO workload with a PVC in cluster A, where Velero is installed. Here, the cbs-csi dynamic storage class is used to create the PVC and PV: the com.tencent.cloud.csi.cbs provisioner in the cluster dynamically creates the PV for the cbs-csi storage class. A sample PVC is as follows:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    volume.beta.kubernetes.io/storage-provisioner: com.tencent.cloud.csi.cbs
  name: minio
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: cbs-csi
  volumeMode: Filesystem
Use the Helm tool to create a MinIO testing service that references the above PVC. For more information on MinIO installation, see here. In this sample, a load balancer has been bound to the MinIO service, and you can access the management page by using a public network address.
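When installing MinIO with Helm, the chart can be pointed at the existing PVC instead of creating its own. A sketch of possible values — the `persistence.existingClaim` and `service.type` keys are assumptions about the MinIO chart in use, so check your chart's documented values:

```yaml
# values.yaml (hypothetical) for the MinIO chart
persistence:
  enabled: true
  existingClaim: minio    # reuse the PVC created above
service:
  type: LoadBalancer      # bind a load balancer so the console is publicly reachable
```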
Log in to the MinIO web management page and upload the images for testing.
Run the following command to restore the backup storage location to read-write mode:
kubectl patch backupstoragelocation default --namespace velero \
    --type merge \
    --patch '{"spec":{"accessMode":"ReadWrite"}}'
- A `Timeout to ensure pod sandbox` error is reported: The add-ons in TKE Serverless cluster Pods communicate with the control plane for health checks. If the network remains disconnected for six minutes after Pod creation, the control plane will terminate and recreate the Pod. In this case, check whether the security group associated with the Pod allows access to the 169.254 route.
- After a Pod is terminated and recreated, its previous logs can no longer be viewed with the `kubectl logs` command, adversely affecting debugging. You can dump the business logs by delaying the termination or setting the `terminationMessage` field as instructed in How to set container's termination message?.
- An `ImageGCFailed` error is reported: A TKE Serverless cluster Pod has 20 GiB of disk space by default. If the disk usage reaches 80%, the TKE Serverless cluster control plane triggers the container image garbage collection process to try to reclaim unused images and free up space. If it fails to free up any space, `ImageGCFailed: failed to garbage collect required amount of images` is reported to remind you that the disk space is insufficient. Common causes of insufficient disk space include: