tke-autoscaling-placeholder can be used to implement scale-out on TKE in seconds, which is suitable for scenarios with sudden traffic increases. This document introduces how to use tke-autoscaling-placeholder to implement Auto Scaling in seconds.

tke-autoscaling-placeholder uses low-priority pods to occupy resources in advance (pause containers that carry a resource request but consume very little), reserving part of the cluster as a buffer for high-priority businesses that are prone to sudden traffic spikes. When pod scale-out is needed, the high-priority pods quickly preempt the resources of the low-priority pods and get scheduled, and the low-priority pods of tke-autoscaling-placeholder change to the Pending status. If you have configured a node pool and enabled Auto Scaling, node scale-out is then triggered. Because some resources are reserved as a buffer, even if the node scale-out process is slow, some pods can still be scaled out and scheduled quickly, achieving scaling in seconds. You can adjust the amount of reserved buffer resources by changing the request or the number of replicas of tke-autoscaling-placeholder based on your needs.
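To make the mechanism concrete, the manifest below is only a rough sketch of what a single placeholder pod looks like; it is not the chart's actual template, and the pod name is hypothetical. The image, priority class name, and requests are taken from the default values in the parameter table below.

```yaml
# Illustrative only: a low-priority "empty" pod that reserves resources.
# Image, priority class, and requests mirror the chart defaults listed below.
apiVersion: v1
kind: Pod
metadata:
  name: placeholder-example
spec:
  priorityClassName: low-priority            # low priority, so business pods can preempt it
  containers:
  - name: pause
    image: ccr.ccs.tencentyun.com/library/pause:latest
    resources:
      requests:
        cpu: 300m                             # reserved on the node but barely consumed
        memory: 600Mi
```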
To use the tke-autoscaling-placeholder app, the cluster version must be later than 1.18. Search for tke-autoscaling-placeholder to find the app and install it in the cluster.
The key parameters are replicaCount and resources.requests, which specify the number of tke-autoscaling-placeholder replicas and the amount of resources occupied by each replica, respectively. Together they determine the size of the buffer resources, so set them based on the amount of extra resources you estimate you will need during sudden traffic increases (a sample values configuration follows the parameter table below).
For complete parameter configuration descriptions for tke-autoscaling-placeholder, see the following table:

| Parameter Name | Description | Default Value |
| --- | --- | --- |
| replicaCount | Number of placeholder replicas | 10 |
| image | Placeholder image address | ccr.ccs.tencentyun.com/library/pause:latest |
| resources.requests.cpu | Amount of CPU resources occupied by a single placeholder replica | 300m |
| resources.requests.memory | Amount of memory occupied by a single placeholder replica | 600Mi |
| lowPriorityClass.create | Whether to create the low PriorityClass (referenced by the placeholder pods) | true |
| lowPriorityClass.name | Name of the low PriorityClass | low-priority |
| nodeSelector | Schedules the placeholder pods only to nodes with the specified labels | {} |
| tolerations | Taints to be tolerated by the placeholder pods | [] |
| affinity | Affinity configuration of the placeholder pods | {} |
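As a hedged example, the snippet below shows how the buffer size could be set through the chart values. The keys match the parameter table above; the concrete numbers are illustrative, and the exact install method (console configuration form or Helm values file) depends on your environment.

```yaml
# values.yaml (illustrative): reserve roughly 20 x (500m CPU / 1Gi memory) as buffer.
replicaCount: 20
resources:
  requests:
    cpu: 500m
    memory: 1Gi
lowPriorityClass:
  create: true
  name: low-priority
```

Larger requests per replica reserve a bigger contiguous chunk on a node, while more replicas spread the buffer across nodes; choose the combination based on the typical size of your business pods.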
After the app is installed, you can run the following command to confirm that the placeholder pods that occupy resources have started successfully:

```
$ kubectl get pod -n default
tke-autoscaling-placeholder-b58fd9d5d-2p6ww   1/1   Running   0   8s
tke-autoscaling-placeholder-b58fd9d5d-55jw7   1/1   Running   0   8s
tke-autoscaling-placeholder-b58fd9d5d-6rq9r   1/1   Running   0   8s
tke-autoscaling-placeholder-b58fd9d5d-7c95t   1/1   Running   0   8s
tke-autoscaling-placeholder-b58fd9d5d-bfg8r   1/1   Running   0   8s
tke-autoscaling-placeholder-b58fd9d5d-cfqt6   1/1   Running   0   8s
tke-autoscaling-placeholder-b58fd9d5d-gmfmr   1/1   Running   0   8s
tke-autoscaling-placeholder-b58fd9d5d-grwlh   1/1   Running   0   8s
tke-autoscaling-placeholder-b58fd9d5d-ph7vl   1/1   Running   0   8s
tke-autoscaling-placeholder-b58fd9d5d-xmrmv   1/1   Running   0   8s
```
The priority of the pods deployed by tke-autoscaling-placeholder is low. Specify a high PriorityClass for your business pods so that they can preempt the reserved resources and scale out quickly. If you have not yet created such a PriorityClass, you can refer to the following sample to create one:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "high priority class"
```
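For example, assuming the manifest above is saved as high-priority.yaml (a hypothetical file name), it can be applied and verified as follows:

```
$ kubectl apply -f high-priority.yaml
$ kubectl get priorityclass high-priority
```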
In the business workload, set priorityClassName to the high PriorityClass. Below is a sample:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 8
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      priorityClassName: high-priority # Specify a high PriorityClass here.
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: 400m
            memory: 800Mi
```
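As an optional check (assuming the Deployment above is saved as nginx.yaml, a hypothetical file name), you can apply it and watch the pods being scheduled:

```
$ kubectl apply -f nginx.yaml
$ kubectl get pod -n default -w
```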
When cluster resources are insufficient, the scaled-out high-priority business pods preempt the resources of the low-priority pods of tke-autoscaling-placeholder and get scheduled, and the status of the corresponding tke-autoscaling-placeholder pods changes to Pending. Below is a sample:

```
$ kubectl get pod -n default
NAME                                          READY   STATUS    RESTARTS   AGE
nginx-bf79bbc8b-5kxcw                         1/1     Running   0          23s
nginx-bf79bbc8b-5xhbx                         1/1     Running   0          23s
nginx-bf79bbc8b-bmzff                         1/1     Running   0          23s
nginx-bf79bbc8b-l2vht                         1/1     Running   0          23s
nginx-bf79bbc8b-q84jq                         1/1     Running   0          23s
nginx-bf79bbc8b-tq2sx                         1/1     Running   0          23s
nginx-bf79bbc8b-tqgxg                         1/1     Running   0          23s
nginx-bf79bbc8b-wz5w5                         1/1     Running   0          23s
tke-autoscaling-placeholder-b58fd9d5d-255r8   0/1     Pending   0          23s
tke-autoscaling-placeholder-b58fd9d5d-4vt8r   0/1     Pending   0          23s
tke-autoscaling-placeholder-b58fd9d5d-55jw7   1/1     Running   0          94m
tke-autoscaling-placeholder-b58fd9d5d-7c95t   1/1     Running   0          94m
tke-autoscaling-placeholder-b58fd9d5d-ph7vl   1/1     Running   0          94m
tke-autoscaling-placeholder-b58fd9d5d-qjrsx   0/1     Pending   0          23s
tke-autoscaling-placeholder-b58fd9d5d-t5qdm   0/1     Pending   0          23s
tke-autoscaling-placeholder-b58fd9d5d-tgvmw   0/1     Pending   0          23s
tke-autoscaling-placeholder-b58fd9d5d-xmrmv   1/1     Running   0          94m
tke-autoscaling-placeholder-b58fd9d5d-zxtwp   0/1     Pending   0          23s
```
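If you want to confirm what happened, the scheduler normally records a Preempted event on the evicted placeholder pods, and the Pending placeholder pods are what trigger node scale-out when Auto Scaling is enabled for the node pool. Event reasons can vary by Kubernetes version, so treat the commands below as a sketch rather than guaranteed output:

```
$ kubectl get events -n default --field-selector reason=Preempted
$ kubectl describe pod <pending-placeholder-pod-name> -n default
```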
This document introduces tke-autoscaling-placeholder, a tool for implementing scaling in seconds. It takes advantage of pod priorities and preemption: low-priority "empty" pods are deployed in advance to occupy resources, which serve as a buffer. When a traffic spike leaves the cluster short of resources, high-priority business pods take over the resources of these low-priority pods while node scale-out is triggered at the same time. In this way, scaling completes in seconds even when resources are tight, and normal business operation is not affected.