What Is the Reason for Configuring the Request/Limit for Containers?
When running, containers consume CPU and memory resources. Without any configuration, the upper limit of resources available to a container is the amount of allocatable resources on its node.
Generally, a node runs many containers.
If a node ran only one container, any resources that the container did not consume would be wasted. Just as a modern personal computer can usually run hundreds of processes, a node usually runs many containers. This introduces another problem: the resources of the node are fixed, so containers may compete for them.
The Limit is required to control the maximum resource usage of a container.
In Kubernetes, the Limit defines the maximum resource usage of a container. If a container tries to use more CPU than its Limit, its CPU usage is throttled. If it tries to use more memory than its Limit, it may be terminated (OOM-killed) rather than simply restricted.
The Request is required to guarantee the minimum resource usage of a container.
Is it enough to configure only the Limit? Imagine 100 containers running on a 10-core node, each needing at least 1 core to start and run normally: none of the containers on the node would run properly. Therefore, Kubernetes uses the Request to guarantee the minimum resource supply for containers.
In summary, Kubernetes uses the Request and Limit to guarantee and restrict the CPU/memory resource usage of a container.
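As an illustration, the sketch below prints a Pod manifest that sets both a Request and a Limit. The function name pod_manifest, the Pod and container names, and the Limit values are hypothetical; only the Request values (50m CPU, 50Mi memory) are taken from the example later in this document.

```shell
# Print a hypothetical Pod manifest with a Request and a Limit.
# All names and the limit values are illustrative assumptions.
pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: request-limit-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    resources:
      requests:        # guaranteed minimum, used for scheduling
        cpu: 50m
        memory: 50Mi
      limits:          # enforced maximum
        cpu: 500m
        memory: 128Mi
EOF
}

pod_manifest
```

On a machine with access to a cluster, the output could be piped to kubectl apply -f - to create the Pod.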
What Are the Units of CPU and Memory?
CPU
The default unit of CPU is cores. You can also use decimal values. A CPU Request of 0.5 asks for half as much CPU as a CPU Request of 1.0. Additionally, 0.5 is equivalent to 500m, which can be read as five hundred millicpu or five hundred millicores.
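The equivalence between cores and millicores can be checked with a small conversion helper. This is a minimal sketch; the function name cpu_to_millicores is hypothetical and it handles only plain decimal values and the m suffix.

```shell
# Convert a CPU quantity (cores, or millicores with an "m" suffix)
# to an integer number of millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;                                            # already millicores
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;  # cores -> millicores, rounded
  esac
}

cpu_to_millicores 0.5    # prints 500
cpu_to_millicores 500m   # prints 500
cpu_to_millicores 2      # prints 2000
```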
Memory
The default unit of memory is bytes. You can express memory as a plain integer or with a unit suffix. Decimal suffixes such as M are powers of 1000, while binary suffixes such as Mi are powers of 1024. For example, the following expressions indicate roughly the same value: 128974848, 129e6, 129M, 128974848000m, and 123Mi.
Note:
The suffixes are case-sensitive. For example, if you request 400m of memory or temporary storage, the actual requested value is 0.4 bytes, because the lowercase m means milli. Someone who types 400m most likely meant 400Mi (mebibytes) or 400M (megabytes).
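To see how much the suffix matters, the sketch below converts a few memory quantities to bytes. The helper mem_to_bytes is hypothetical and covers only the suffixes m, k, M, G, Ki, Mi, and Gi; it is not the parser Kubernetes itself uses.

```shell
# Convert a memory quantity to bytes (sketch; limited set of suffixes).
mem_to_bytes() {
  awk -v q="$1" 'BEGIN {
    suffix = q
    sub(/^[0-9.]+(e[0-9]+)?/, "", suffix)            # strip the number, keep the suffix
    number = substr(q, 1, length(q) - length(suffix)) + 0
    if      (suffix == "")   bytes = number
    else if (suffix == "m")  bytes = number / 1000   # milli: thousandths of a byte
    else if (suffix == "k")  bytes = number * 1000
    else if (suffix == "M")  bytes = number * 1000000
    else if (suffix == "G")  bytes = number * 1000000000
    else if (suffix == "Ki") bytes = number * 1024
    else if (suffix == "Mi") bytes = number * 1048576
    else if (suffix == "Gi") bytes = number * 1073741824
    else exit 1                                      # suffix not covered by this sketch
    printf "%.15g\n", bytes
  }'
}

mem_to_bytes 129M            # prints 129000000
mem_to_bytes 123Mi           # prints 128974848
mem_to_bytes 128974848000m   # prints 128974848
mem_to_bytes 400m            # prints 0.4
mem_to_bytes 400Mi           # prints 419430400
```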
How Do I Understand the Request/Limit?
The Request and Limit in Kubernetes are implemented by means of CPU Share and CPU Quota.
CPU Share
When multiple containers run on a single machine, you need to understand the concept of CPU Shares to know how CPU is allocated among them.
CPU Shares are a feature of Linux Control Groups (cgroups) that controls the CPU time available for processes in a container.
CPU time is the amount of time a CPU spends processing instructions from a program or the operating system, as opposed to wall-clock time. For example, when a process is interrupted, suspended, or sleeping, its CPU time does not increase; when the process resumes, its CPU time continues to increase from the point of interruption.
Characteristics of CPU Shares
1. CPU Shares are a relative measure, not an absolute one.
The CPU Shares of containers are relative values used to divide CPU time among different containers. The number itself has no meaning in isolation. For example, configuring the CPU Shares of container A as 512 says nothing about how much CPU time the container will obtain. If the CPU Shares of another container B are configured as 1024 at the same time, container B will obtain twice the CPU time of container A when both compete for CPU. This still does not tell you the actual CPU time each container will obtain, only the ratio between them. If A and B run simultaneously on a 3-core device, they theoretically obtain 1 core and 2 cores respectively at each moment. On a 6-core device, they obtain 2 cores and 4 cores respectively. If only 0.3 cores are available, they obtain 0.1 cores and 0.2 cores respectively.
2. CPU Shares take effect only when resource contention occurs.
The CPU cores are allocated based on the configured CPU Shares values only when container A and container B both attempt to run at the same time.
If only one container is running, the container can use all the CPU.
If multiple containers are running simultaneously, the CPU cores on the node will be allocated based on the configured CPU Shares values.
CPU Shares configured for containers that are not currently running do not affect the allocation of CPU time among running containers.
3. The purpose of CPU Shares is to maximize the utilization of CPU resources.
When no other container is competing for CPU, any container may use all CPU resources on a node regardless of its CPU Shares value.
When CPU contention occurs, the CPU Shares value can be used to determine how much CPU time each container should obtain.
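The proportional split described in the characteristics above can be reproduced with integer arithmetic. This is a minimal sketch of the idealized calculation, assuming all containers are busy at the same time; the function name cpu_under_contention is hypothetical.

```shell
# cpu_under_contention SHARES TOTAL_SHARES NODE_MILLICORES
# Idealized CPU (in millicores) a container obtains when every container is busy.
cpu_under_contention() {
  echo $(( $1 * $3 / $2 ))
}

total=$(( 512 + 1024 ))                    # container A + container B

cpu_under_contention 512  "$total" 3000    # A on 3 cores: prints 1000
cpu_under_contention 1024 "$total" 3000    # B on 3 cores: prints 2000
cpu_under_contention 512  "$total" 6000    # A on 6 cores: prints 2000
cpu_under_contention 1024 "$total" 6000    # B on 6 cores: prints 4000
```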
CPU Quota
CPU Quota is used to restrict the maximum resource usage of a container. Even if there are remaining resources on a node, the container still cannot use more resources than its CPU Quota value.
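As a rough illustration of how a CPU Limit maps to a quota, the sketch below assumes the Linux CFS bandwidth controller with a period of 100000 microseconds (100 ms), which is a common default. The variable CFS_PERIOD_US and the function cpu_limit_to_quota_us are hypothetical names, and the actual period on a given node may be configured differently.

```shell
CFS_PERIOD_US=100000   # assumed scheduling period: 100 ms

# cpu_limit_to_quota_us LIMIT_MILLICORES
# CPU time (in microseconds) the container may use per period.
cpu_limit_to_quota_us() {
  echo $(( $1 * CFS_PERIOD_US / 1000 ))
}

cpu_limit_to_quota_us 500    # prints 50000  (half a core: 50 ms per 100 ms)
cpu_limit_to_quota_us 2000   # prints 200000 (two cores: 200 ms per 100 ms)
```

Once the quota for a period is used up, the container is throttled until the next period starts, even if the node has idle CPU.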
CPU Share in Kubernetes
The Request and Limit in Kubernetes are implemented by means of CPU Share and CPU Quota. However, the Request carries additional meaning:
1. The Request is an absolute value, not a relative one, and guarantees the minimum available resources for a container.
2. The scheduler uses the Request to choose a node for the current Pod from among the nodes whose remaining available resources are at least as large as the Request.
3. When resource contention occurs, the CPU is allocated based on the relative values of CPU Shares.
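The conversion from a CPU Request to a CPU Shares value can be sketched as follows. It assumes the commonly described cgroup v1 behavior, where 1 core corresponds to 1024 shares and the result never drops below 2; the function name cpu_request_to_shares is hypothetical, and nodes using cgroup v2 express the same idea through a different setting.

```shell
# cpu_request_to_shares REQUEST_MILLICORES
# Assumed mapping: 1000 millicores -> 1024 shares, with a floor of 2.
cpu_request_to_shares() (
  shares=$(( $1 * 1024 / 1000 ))
  if [ "$shares" -lt 2 ]; then
    shares=2               # floor applied when the Request is zero or very small
  fi
  echo "$shares"
)

cpu_request_to_shares 1000   # prints 1024
cpu_request_to_shares 500    # prints 512
cpu_request_to_shares 50     # prints 51
cpu_request_to_shares 0      # prints 2
```

Under this assumption, a container with no CPU Request still receives a small nonzero weight, which is why it gets very little CPU when the node is contended.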
Pod Without a Request
A Pod without a Request can be scheduled to any node, because the remaining available resources of any node can meet the requirements of this Pod. However, in case of contention, it receives only a minimal share of resources and may be starved indefinitely.
Actual Application
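Create an interactive busybox Pod in a terminal (the --requests flag is accepted by older kubectl releases; newer releases have removed it, in which case set the Request in a Pod manifest instead):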
kubectl run -i --tty --rm busybox \
--image=busybox \
--restart=Never \
--requests='cpu=50m,memory=50Mi' -- sh
Run the following commands in the interactive session to make this Pod consume as much CPU and memory as it can obtain on the current node. Run dd first, because the infinite loop never returns:
dd if=/dev/zero of=/dev/shm/fill bs=1k count=1024k
while true; do true; done
Check the resource usage of the current Pod in another terminal:
kubectl top pods
NAME CPU(cores) MEMORY(bytes)
busybox 460m 65Mi
Both the CPU and memory usage of the busybox Pod exceed its Request. In theory, however, a Pod running an infinite loop should consume all available CPU resources. Why does it consume only 460m? Because other Pods and processes on the node compete for CPU resources through CPU Share.
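The comparison between usage and Request can also be automated. The sketch below reads output in the format shown above from standard input and reports Pods whose CPU usage exceeds a given Request. The function name pods_over_request is hypothetical, and the sketch assumes the CPU column is always reported in millicores with an m suffix.

```shell
# pods_over_request REQUEST_MILLICORES
# Reads "kubectl top pods" style output on stdin and skips the header line.
pods_over_request() {
  awk -v req="$1" 'NR > 1 {
    cpu = $2
    sub(/m$/, "", cpu)                       # drop the millicore suffix
    if (cpu + 0 > req + 0)
      print $1, cpu "m", "exceeds", req "m"
  }'
}

# Sample input copied from the output above.
printf 'NAME CPU(cores) MEMORY(bytes)\nbusybox 460m 65Mi\n' | pods_over_request 50
# prints: busybox 460m exceeds 50m
```

Against a live cluster, the input could come from kubectl top pods directly instead of the printf sample.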
Summary
In Kubernetes, the Request is used to guarantee the minimum resource usage of a container.
The Limit is used to restrict the maximum resource usage of a container.
When you enter values, pay attention to the default units of CPU and memory, which are cores and bytes respectively.
When CPU contention occurs, Kubernetes allocates CPU in proportion to the Request amounts of the competing containers.