Kubernetes Vertical Pod Autoscaler (VPA) automatically adjusts the CPU and memory reserved for Pods, which improves cluster resource utilization and frees CPU and memory for other Pods. This document describes how to use the VPA community edition in TKE to scale Pods up and down.
The auto-scaling feature of VPA makes TKE flexible and adaptive. When the business load increases sharply, VPA can quickly raise the Request of a container within the range set by the user. When the business load decreases, VPA can lower the Request to match actual needs and save computing resources. The entire process is automated and requires no manual intervention. VPA suits scenarios that require rapid scale-up and scenarios that scale stateful applications. VPA can also be used only to recommend a more reasonable Request, which improves the resource utilization of the container while ensuring that it has sufficient resources available.
Compared with Horizontal Pod Autoscaler (HPA), VPA has the following advantages:
Note: The VPA community edition is in testing. Use this feature with caution. We recommend setting updateMode to "Off" so that VPA does not automatically change the Request value. You can still view the recommended Request values for the bound workload in the VPA object.
Auto mode is equivalent to the Initial mode. For more limitations on VPA, see VPA Known limitations.
sh
git clone https://github.com/kubernetes/autoscaler.git
Switch to the vertical-pod-autoscaler directory:
cd autoscaler/vertical-pod-autoscaler/
./hack/vpa-down.sh
./hack/vpa-up.sh
kubectl get deploy -n kube-system | grep vpa
After the VPA component is created, you can check the three Deployments in the kube-system namespace: vpa-admission-controller, vpa-recommender, and vpa-updater.
Note:
- We do not recommend using VPA to automatically update Request in a production environment.
- You can use VPA to view the recommended value of Request and manually trigger the update as needed.
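Besides listing the Deployments, one way to confirm that the three components are actually running is to wait for each rollout to finish. The following is a minimal sketch; the function name wait_for_vpa and the 120s timeout are illustrative assumptions, not part of the VPA scripts.

```shell
# Wait until each VPA Deployment in kube-system reports a completed rollout.
# wait_for_vpa and the 120s timeout are hypothetical choices for this sketch.
wait_for_vpa() {
  for name in vpa-admission-controller vpa-recommender vpa-updater; do
    kubectl -n kube-system rollout status "deployment/${name}" --timeout=120s || return 1
  done
}

# Usage (requires access to the cluster):
#   wait_for_vpa && echo "VPA components are ready"
```

The function stops at the first Deployment that does not become ready within the timeout, which makes a failed installation visible early.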
In this sample, you will create a VPA object with updateMode set to Off and a Deployment with two Pods, each running one container. After the Pods are created, VPA analyzes the CPU and memory requirements of the containers and records the recommended Request values in the status field. VPA will not automatically update the resource requests of the running containers.
Run the following command to create a VPA object named tke-vpa that points to a Deployment named tke-deployment:
shell
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: tke-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: tke-deployment
  updatePolicy:
    updateMode: "Off"
EOF
Run the following command to create a Deployment object named tke-deployment:
shell
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tke-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: tke-deployment
  template:
    metadata:
      labels:
        app: tke-deployment
    spec:
      containers:
      - name: tke-container
        image: nginx
EOF
The generated Deployment object is shown as follows:
Note: The tke-deployment created above does not set CPU or memory Requests, so the QoS class of its Pods is BestEffort and the Pods are easily evicted. We recommend that you set the Request and Limit when creating the Deployment of an application. If you create a workload via the TKE console, a default Request and Limit are automatically set for each container.
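The QoS class mentioned in the note can be read from the Pod status. The sketch below assumes the app=tke-deployment label from the Deployment above; the function name qos_by_label is a hypothetical helper introduced for illustration.

```shell
# Print each matching Pod name with its QoS class, one Pod per line.
# $1 is a label selector. qos_by_label is a hypothetical helper name.
qos_by_label() {
  kubectl get pods -l "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.qosClass}{"\n"}{end}'
}

# Usage (requires access to the cluster):
#   qos_by_label app=tke-deployment
```

For the sample Deployment, each Pod is expected to report BestEffort, because its container sets neither Requests nor Limits.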
Run the following command to view the recommended Requests of CPU and memory by VPA:
shell
kubectl get vpa tke-vpa -o yaml
The execution results are as follows:
yaml
...
  recommendation:
    containerRecommendations:
    - containerName: tke-container
      lowerBound:
        cpu: 25m
        memory: 262144k
      target:  # Recommended value
        cpu: 25m
        memory: 262144k
      uncappedTarget:
        cpu: 25m
        memory: 262144k
      upperBound:
        cpu: 1771m
        memory: 1851500k
The CPU and memory values under target are the recommended Requests. You can delete the previous Deployment and create a new Deployment that uses the recommended Requests.
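When only the target values are needed, they can be extracted directly instead of reading the full YAML. This is a minimal sketch that assumes the status layout shown above; the helper name vpa_target is hypothetical.

```shell
# Print one recommended Request value from a VPA object.
# $1 = VPA name, $2 = container name, $3 = resource (cpu or memory).
# vpa_target is a hypothetical helper name used only in this sketch.
vpa_target() {
  kubectl get vpa "$1" -o \
    jsonpath="{.status.recommendation.containerRecommendations[?(@.containerName==\"$2\")].target.$3}"
}

# Usage (requires access to the cluster):
#   vpa_target tke-vpa tke-container cpu
#   vpa_target tke-vpa tke-container memory
```

Selecting the container by name rather than by index keeps the query correct when a Pod has more than one container.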
Field | Description |
---|---|
lowerBound | The minimum value recommended. The use of a Request smaller than this value may have a major impact on performance or availability. |
target | Recommended value. The VPA calculates the most appropriate Request. |
uncappedTarget | The latest recommended value. It is based only on actual resource usage and does not consider the recommended value range set for the container in .spec.resourcePolicy.containerPolicies. The uncappedTarget may differ from the recommended lowerBound and upperBound. This field only indicates status and does not affect actual resource allocation. |
upperBound | The maximum value recommended. The use of a Request larger than this value may cause a resource waste. |
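The value range referred to in the uncappedTarget row is set per container in .spec.resourcePolicy.containerPolicies. The following sketch writes such a manifest to a file, assuming the minAllowed and maxAllowed fields of the autoscaling.k8s.io/v1 API. The object name tke-bounded-vpa, the file name, and all limit values are hypothetical.

```shell
# Write a VPA manifest that restricts recommendations to a value range.
# The name tke-bounded-vpa and the min/max values are illustrative assumptions.
cat > tke-bounded-vpa.yaml <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: tke-bounded-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: tke-deployment
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: tke-container
      minAllowed:
        cpu: 50m
        memory: 128Mi
      maxAllowed:
        cpu: "1"
        memory: 1Gi
EOF

# Apply it with: kubectl apply -f tke-bounded-vpa.yaml
```

With a range like this in place, target is expected to stay within minAllowed and maxAllowed, while uncappedTarget keeps reporting the estimate based on usage alone.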
If a Pod contains multiple containers, for example an application container and a secondary container, you can stop VPA from recommending a Request for the secondary container to save cluster resources.
In this sample, you will create a VPA with recommendations disabled for a specific container, and a Deployment with one Pod that contains two containers. After the Pod is created, VPA calculates a recommended value for only one container and does not recommend a Request for the other container.
Run the following command to create a VPA object named tke-opt-vpa that points to a Deployment named tke-opt-deployment:
shell
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: tke-opt-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: tke-opt-deployment
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: tke-opt-sidecar
      mode: "Off"
EOF
Note: In the .spec.resourcePolicy.containerPolicies of the VPA, the mode of tke-opt-sidecar is set to "Off", so VPA will not calculate or recommend a new Request for tke-opt-sidecar.
Run the following command to create a Deployment object named tke-opt-deployment:
sh
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tke-opt-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tke-opt-deployment
  template:
    metadata:
      labels:
        app: tke-opt-deployment
    spec:
      containers:
      - name: tke-opt-container
        image: nginx
      - name: tke-opt-sidecar
        image: busybox
        command: ["sh","-c","while true; do echo TKE VPA; sleep 60; done"]
EOF
The generated Deployment object is shown as follows:
Run the following command to view the recommended Requests of CPU and memory by VPA:
shell
kubectl get vpa tke-opt-vpa -o yaml
The execution results are as follows:
yaml
...
  recommendation:
    containerRecommendations:
    - containerName: tke-opt-container
      lowerBound:
        cpu: 25m
        memory: 262144k
      target:
        cpu: 25m
        memory: 262144k
      uncappedTarget:
        cpu: 25m
        memory: 262144k
      upperBound:
        cpu: 1595m
        memory: 1667500k
The execution result contains a recommended value only for tke-opt-container, and none for tke-opt-sidecar.
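A quick way to check which containers received a recommendation is to print only the container names from the VPA status. This is a minimal sketch that assumes the status layout shown above; the helper name vpa_recommended_containers is hypothetical.

```shell
# Print the names of all containers that have a recommendation in a VPA object.
# $1 = VPA name. vpa_recommended_containers is a hypothetical helper name.
vpa_recommended_containers() {
  kubectl get vpa "$1" \
    -o jsonpath='{.status.recommendation.containerRecommendations[*].containerName}'
}

# Usage (requires access to the cluster):
#   vpa_recommended_containers tke-opt-vpa
```

For the sample above, the output is expected to list tke-opt-container but not tke-opt-sidecar.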
Note: Automatically updating the resources of running Pods is an experimental feature of VPA. We recommend that you do not use this feature in a production environment.
In this sample, you will create a VPA that can automatically adjust the CPU and memory Requests, and a Deployment with two Pods. The container in each Pod sets resource Requests and Limits.
Run the following command to create a VPA object named tke-auto-vpa that points to a Deployment named tke-auto-deployment:
shell
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: tke-auto-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: tke-auto-deployment
  updatePolicy:
    updateMode: "Auto"
EOF
Note: The updateMode field of this VPA is set to Auto, which means that the VPA can update the CPU and memory Requests during the life cycle of the Pod. VPA can remove the Pod, adjust the CPU and memory Requests, and then rebuild the Pod.
Run the following command to create a Deployment object named tke-auto-deployment:
shell
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tke-auto-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: tke-auto-deployment
  template:
    metadata:
      labels:
        app: tke-auto-deployment
    spec:
      containers:
      - name: tke-container
        image: nginx
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 200m
            memory: 200Mi
EOF
Note: The Deployment created above sets both the Request and the Limit of each resource. In this case, VPA not only recommends the Request, but also derives the Limit from the initial ratio of Request to Limit. For example, the initial ratio of the CPU Request to the CPU Limit in the YAML is 100m:200m, namely 1:2, so the Limit recommended by VPA is twice the Request recommended in the VPA object.
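The ratio rule in the note can be checked with a small calculation. The sketch below is only a model of the proportional scaling described above, using integer arithmetic. The function name scaled_limit is hypothetical, and the 25m and 262144k inputs assume the recommendation values shown elsewhere in this document.

```shell
# scaled_limit ORIG_REQUEST ORIG_LIMIT NEW_REQUEST
# Scales NEW_REQUEST by the original Limit-to-Request ratio.
# ORIG_REQUEST and ORIG_LIMIT share one unit; the result is in the unit of NEW_REQUEST.
# scaled_limit is a hypothetical helper, not part of VPA.
scaled_limit() {
  echo $(( $3 * $2 / $1 ))
}

scaled_limit 100 200 25       # CPU: 100m:200m with a 25m Request, prints 50
scaled_limit 100 200 262144   # memory: 100Mi:200Mi with a 262144k Request, prints 524288
```

A memory Limit of 524288k is the same amount as 500Mi, so a recommended Request of 25m CPU and 262144k memory would give Limits of 50m and 500Mi under this rule.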
The generated Deployment object is shown as follows:
Run the following command to obtain the detailed information of the running Pod:
sh
kubectl get pod pod-name -o yaml
The execution result is shown below. VPA changed the original Requests and Limits to its recommended values while keeping the initial ratio between them, and added an annotation that records the update:
yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    ...
    vpaObservedContainers: tke-container
    vpaUpdates: Pod resources updated by tke-auto-vpa: container 0: memory request, cpu request
...
spec:
  containers:
  ...
    resources:
      limits:  # The new Request and Limits will maintain the initial ratio
        cpu: 50m
        memory: 500Mi
      requests:
        cpu: 25m
        memory: 262144k
...
Run the following command to obtain the detailed information of the relevant VPA:
sh
kubectl get vpa tke-auto-vpa -o yaml
The execution results are as follows:
yaml
...
  recommendation:
    containerRecommendations:
    - containerName: tke-container
      lowerBound:
        cpu: 25m
        memory: 262144k
      target:
        cpu: 25m
        memory: 262144k
      uncappedTarget:
        cpu: 25m
        memory: 262144k
      upperBound:
        cpu: 101m
        memory: 262144k
The target field means that the container runs best when its CPU and memory Requests are 25m and 262144k respectively.
VPA uses the recommended lowerBound and upperBound values to decide whether to evict a Pod and replace it with a new one. If the Request of the Pod is smaller than the lower bound or larger than the upper bound, VPA removes the Pod and replaces it with a Pod that uses the recommended values.
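The bounds check described above can be sketched as a simple comparison. This is a simplified model of the stated rule, not the actual logic of the vpa-updater component, and the function name outside_bounds is hypothetical. The inputs reuse the sample values: the original Requests of 100m and 100Mi, and the bounds from the tke-auto-vpa output.

```shell
# outside_bounds REQUEST LOWER UPPER   (plain integers in the same unit)
# Prints "evict" when REQUEST is below LOWER or above UPPER, otherwise "keep".
# outside_bounds is a hypothetical helper that models the rule in the text.
outside_bounds() {
  if [ "$1" -lt "$2" ] || [ "$1" -gt "$3" ]; then
    echo evict
  else
    echo keep
  fi
}

outside_bounds 100 25 101                      # CPU in millicores, prints keep
outside_bounds 104857600 262144000 262144000   # memory in bytes (100Mi vs 262144k), prints evict
```

Under this model the CPU Request of 100m is inside its bounds, but the memory Request of 100Mi is below the lower bound, which is consistent with the Pod in the sample being replaced.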
If the following error is reported when you run the vpa-up.sh script:
shell
ERROR: Failed to create CA certificate for self-signing. If the error is "unknown option -addext", update your openssl version or deploy VPA from the vpa-release-0.8 branch.
check the following:
- The openssl version of the cluster CVM is later than v1.1.1.
- The vpa-release-0.8 branch of the Autoscaler project is used.
If the VPA-related load fails to start up, and the following message is generated:
Message 1: indicates that the Pods in the load fail to run.
Message 2: indicates the address of the image.
The VPA-related load could not be started up because the image located in GCR could not be downloaded. You can try the following steps to solve the problem: