Overview

Last updated: 2024-04-24 15:55:36

Component Overview

Kubernetes schedules Pods based on their resource Requests: a node's schedulable capacity is occupied by the full Request amount of each Pod and cannot be reclaimed while the Pod runs. The native node dedicated scheduler is a scheduling plugin developed by Tencent Kubernetes Engine (TKE) on top of the native Kubernetes kube-scheduler Extender mechanism. It can virtually amplify a node's capacity, resolving the problem of node resources being fully requested while actual utilization stays low.
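For reference, the sketch below shows how a scheduler Extender is generally wired into kube-scheduler through its configuration. This is a minimal illustration only: the endpoint URL, verbs, and weight are assumptions, and TKE's actual crane-scheduler-policy may use a different format.

# Minimal, illustrative Extender wiring for kube-scheduler.
# The URL, verbs, and weight below are assumptions for illustration;
# this is not TKE's actual crane-scheduler-policy.
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
extenders:
- urlPrefix: "http://crane-scheduler.kube-system.svc:8080/scheduler"  # hypothetical endpoint
  filterVerb: "filter"          # filter out nodes above the scheduling watermark
  prioritizeVerb: "prioritize"  # prefer nodes with lower actual load
  weight: 1
  enableHTTPS: false
  nodeCacheCapable: true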

Kubernetes Objects Deployed in the Cluster

| Kubernetes Object Name | Type | Requested Resources | Namespace |
| --- | --- | --- | --- |
| crane-scheduler-controller | Deployment | 200m CPU and 200Mi memory per instance; 1 instance | kube-system |
| crane-descheduler | Deployment | 200m CPU and 200Mi memory per instance; 1 instance | kube-system |
| crane-scheduler | Deployment | 200m CPU and 200Mi memory per instance; 3 instances | kube-system |
| crane-scheduler-controller | Service | - | kube-system |
| crane-scheduler | Service | - | kube-system |
| crane-scheduler | ClusterRole | - | - |
| crane-descheduler | ClusterRole | - | - |
| crane-scheduler | ClusterRoleBinding | - | - |
| crane-descheduler | ClusterRoleBinding | - | - |
| crane-scheduler-policy | ConfigMap | - | kube-system |
| crane-descheduler-policy | ConfigMap | - | kube-system |
| ClusterNodeResourcePolicy | CRD | - | - |
| CraneSchedulerConfiguration | CRD | - | - |
| NodeResourcePolicy | CRD | - | - |
| crane-scheduler-controller-mutating-webhook | MutatingWebhookConfiguration | - | - |

Application Scenarios

Scenario 1: Resolving the issue of high node box rate but low utilization

Note:
The fundamental concepts are as follows:
Box rate: the ratio of the sum of the Requests of all Pods on a node to the node's actual specification.
Utilization: the ratio of the total actual resource usage of all Pods on a node to the node's actual specification.
The native Kubernetes scheduler schedules based on Pod Requests. Therefore, even if actual usage on a node is low, new Pods cannot be scheduled once the sum of the Requests of all Pods on the node approaches the node's actual specification, which wastes substantial resources. Moreover, to keep their services stable, businesses tend to request more resources than they need, that is, set a large Request, so node resources are occupied and cannot be freed. As a result, the node's box rate is high while its actual utilization is low.
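As a hypothetical illustration: on an 8-core node, if the CPU Requests of all Pods total 7.5 cores while their actual usage is only 2 cores, the box rate is 7.5/8 ≈ 94% but the utilization is 2/8 = 25%; the node looks full to the scheduler even though most of its CPU sits idle.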
In this case, you can use the native node dedicated scheduler to virtually amplify the node's CPU and memory specifications and enlarge its schedulable resources, so that more Pods can be scheduled onto the node.

Scenario 2: Setting the watermark of the nodes

Setting node watermarks means setting target utilization rates for a node to keep it stable:
Scheduling-time watermark control: sets the target resource utilization of native nodes during scheduling to guarantee stability. When Pods are scheduled, nodes whose utilization is above this watermark are not selected. In addition, among the nodes that meet the watermark requirement, nodes with a lower actual load are preferred, which balances the utilization distribution across the cluster's nodes.
Runtime watermark control: sets the target resource utilization of native nodes at runtime to guarantee stability. At runtime, nodes whose utilization exceeds this watermark can trigger evictions. Because eviction is a high-risk operation, pay attention to the notes below.
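For example, with a hypothetical scheduling watermark of 60% CPU, a node currently running at 70% CPU utilization is filtered out during scheduling even if its box rate leaves room for more Requests; with a hypothetical runtime watermark of 80%, a node sustained above 80% becomes a candidate for eviction. The concrete watermark values here are illustrative, not defaults.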

Notes

1. To avoid draining important Pods, this feature does not evict Pods by default. For Pods that can be safely drained, you must explicitly mark the workload the Pod belongs to. For example, a StatefulSet, Deployment, or other object can be annotated as drainable (see the sketch after this list):
descheduler.alpha.kubernetes.io/evictable: 'true'
2. It is recommended that you enable event persistence for the cluster so that you can better monitor component exceptions and troubleshoot issues. When a Pod is evicted, corresponding events are generated; you can check whether a Pod is being repeatedly evicted based on the Descheduled event.
3. Eviction imposes requirements on nodes: the cluster must have 3 or more low-load native nodes, where low-load means the node's load is below its runtime watermark.
4. After nodes are filtered, the workloads on a node are drained. This requires that the workload's ready replica count is greater than or equal to 2 and is at least half of the replicas declared in the workload's spec.
5. At the Pod level, if a Pod's load exceeds the node's eviction watermark, the Pod is not evicted, to prevent overloading other nodes by relocating it there.
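For note 1, the following is a minimal sketch of marking a workload as drainable. The Deployment name, labels, and image are placeholders; only the annotation itself comes from this document, and it is placed on the workload object as note 1 describes.

# Illustrative sketch: a Deployment explicitly marked as safe to drain.
# Name, labels, and image are placeholders; the annotation is the one from note 1.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app                # placeholder
  annotations:
    descheduler.alpha.kubernetes.io/evictable: 'true'  # allow the descheduler to evict this workload's Pods
spec:
  replicas: 3                   # keep enough ready replicas (see note 4)
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
      - name: app
        image: nginx:1.25       # placeholder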

Scenario 3: Pods under specified Namespace shall be allocated only to native nodes upon the subsequent scheduling

Native nodes are a new node type launched by the Tencent Cloud TKE team. Built on the technical expertise Tencent Cloud has accumulated from operating tens of millions of container cores, they provide cloud-native, highly stable, and fast-responding Kubernetes node management capabilities. Native nodes support amplifiable node specifications and Request recommendation, so we recommend scheduling your workloads to them to take full advantage of these capabilities. When enabling the native node scheduler, you can select Namespaces; Pods under the specified Namespaces will then be scheduled only to native nodes.
Note:
If native node resources are insufficient at that point, Pods will become Pending.

Limits

This feature is supported only on native nodes. For more information, see Native Node Overview.
Make sure that the Kubernetes version of the cluster is v1.22.5-tke.8, v1.20.6-tke.24, v1.18.4-tke.28, v1.16.3-tke.30, or later. To upgrade a cluster, see Upgrading a Cluster.

Risk Control

After this component is uninstalled, only the scheduling logic of the native node dedicated scheduler is removed; the scheduling capability of the native kube-scheduler is unaffected. Pods already scheduled to native nodes are not affected, because they have already been placed. However, if the kubelet on a native node restarts, Pods may be evicted, because the sum of the Requests of the Pods on the node may exceed the node's real specification.
If the amplification coefficient is adjusted downwards, existing Pods on native nodes are not affected, because they have already been placed. However, if the kubelet on a native node restarts, Pods may be evicted, because the sum of the Requests of the Pods on the node may exceed the node's specification after the reduced amplification.
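As a hypothetical illustration: a 4-core native node amplified by a coefficient of 1.5 is scheduled as if it had 6 cores. If the Pods on it already request 5 cores and the coefficient is then lowered to 1.0, a kubelet restart may evict Pods until the remaining Requests fit within the real 4 cores.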
The node resources that users see in the Kubernetes cluster will differ from the resources of the corresponding CVM nodes.
Excessive load and instability issues may arise later.
After node specifications are amplified, the node's kubelet and resource QoS-related modules may be affected. For example, kubelet CPU pinning: when a 4-core node is scheduled as an 8-core node, the CPU core binding of Pods may be affected.

Component Permission Description

Crane Scheduler Permission

Permission Description

The permissions of this component are the minimal set required for its features to operate.

Permission Scenarios

| Feature | Involved Object | Involved Operation Permission |
| --- | --- | --- |
| Track node updates and changes, and obtain node utilization. | nodes | get/watch/list |
| Track Pod updates and changes, and determine node scheduling priority based on recent Pod scheduling in the cluster. | pods/namespaces | get/watch/list |
| Write node utilization onto node resources, decoupling the scheduling logic from the query logic. | nodes/status | patch |
| Support multiple replicas to ensure component availability. | leases | create/get/update |
| Track ConfigMap updates and changes to implement scheduling specified Pods to native nodes. | configmaps | get/list/watch |

Permission Definition

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: crane-scheduler
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - namespaces
  verbs:
  - list
  - watch
  - get
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - extensions
  - apps
  resources:
  - deployments/scale
  verbs:
  - get
  - update
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - create
  - get
  - update
- apiGroups:
  - "scheduling.crane.io"
  resources:
  - clusternoderesourcepolicies
  - noderesourcepolicies
  - craneschedulerconfigurations
  verbs:
  - get
  - list
  - watch
  - update
  - create
  - patch

Crane Descheduler Permission

Permission Description

The permissions of this component are the minimal set required for its features to operate.

Permission Scenarios

| Feature | Involved Object | Involved Operation Permission |
| --- | --- | --- |
| Track node updates and changes, and obtain node utilization. | nodes | get/watch/list |
| Track Pod updates and changes, and determine which Pods to evict first based on Pod information in the cluster. | pods | get/watch/list |
| Drain Pods. | pods/eviction | create |
| Determine whether the ready replicas of the workload a Pod belongs to reach half or more of the required total, to decide whether the Pod can be drained. | replicasets/deployments/statefulsets/statefulsetpluses/jobs | get |
| Report events when draining Pods. | events | create |

Permission Definition

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: crane-descheduler
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "watch", "list"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
- apiGroups: [""]
  resources: ["nodes/status"]
  verbs: ["patch"]
- apiGroups: [""]
  resources: ["pods/eviction"]
  verbs: ["create"]
- apiGroups: ["*"]
  resources: ["replicasets"]
  verbs: ["get"]
- apiGroups: ["*"]
  resources: ["deployments"]
  verbs: ["get"]
- apiGroups: ["apps"]
  resources: ["statefulsets"]
  verbs: ["get"]
- apiGroups: ["platform.stke"]
  resources: ["statefulsetpluses"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["events"]
  verbs: ["create"]
- apiGroups: ["*"]
  resources: ["jobs"]
  verbs: ["get"]
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  # The source snippet is truncated here; verbs assumed to match the
  # crane-scheduler leases rule above.
  verbs: ["create", "get", "update"]
