Principles for Adding Preset Resources When Scale-out Rules Are Executed
Each cluster can be configured with up to 10 scaling specifications. When a scale-out rule is triggered, nodes are added according to specification priority. If the highest-priority specification does not have enough resources, lower-priority specifications are combined with it to make up the required compute resources (pay-as-you-go and spot instances follow the same order).
When resources are sufficient: 1 > 2 > 3 > 4 > 5
Example:
When 5 specifications are preset and resources are sufficient, if the scale-out rule is triggered to scale out 10 nodes, all 10 nodes will be scaled out based on specification 1, and the other preset specifications will not be selected.
When resources are insufficient: 1+2 > 1+2+3 > 1+2+3+4 > 1+2+3+4+5
Example:
When preset specification 1 has 8 nodes, specification 2 has 4 nodes, and specification 3 has 3 nodes, if the scale-out rule triggers the need to scale out 13 nodes, 8 nodes will be scaled out based on specification 1, 4 nodes based on specification 2, and 1 node based on specification 3, in that order.
When the resource specification is out of stock, assuming specification 2 is unavailable: 1+3 > 1+3+4 > 1+3+4+5.
Example:
When preset specification 1 has 8 nodes, specification 2 is unavailable, and specification 3 has 3 nodes, if the scale-out rule is triggered to scale out 10 nodes, 8 nodes will be scaled out based on specification 1, specification 2 will be skipped, and 2 nodes will be scaled out based on specification 3, in that order.
When preset specification 1 has 8 nodes and all other preset specifications are unavailable, if the scale-out rule is triggered to scale out 10 nodes, 8 nodes will be scaled out based on specification 1 and the scale-out partially succeeds.
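The allocation behavior described above can be sketched as follows. This is a minimal illustration only; the Spec structure and allocate_scale_out helper are hypothetical names, not the product's implementation.

```python
from dataclasses import dataclass

@dataclass
class Spec:
    name: str           # e.g. "spec1" (illustrative)
    priority: int       # 1 is the highest priority
    available: int      # nodes currently purchasable for this specification

def allocate_scale_out(specs: list[Spec], nodes_needed: int) -> dict[str, int]:
    """Fill the request from the highest-priority specification first,
    falling through to lower priorities only when stock runs out."""
    plan: dict[str, int] = {}
    remaining = nodes_needed
    for spec in sorted(specs, key=lambda s: s.priority):
        if remaining == 0:
            break
        take = min(spec.available, remaining)
        if take > 0:
            plan[spec.name] = take
            remaining -= take
    # If remaining > 0 here, the scale-out only partially succeeds.
    return plan

# Mirrors the example above: spec1 has 8 nodes, spec2 has 4, spec3 has 3,
# and the rule asks for 13 nodes -> 8 + 4 + 1.
print(allocate_scale_out(
    [Spec("spec1", 1, 8), Spec("spec2", 2, 4), Spec("spec3", 3, 3)], 13))
# {'spec1': 8, 'spec2': 4, 'spec3': 1}
```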
Scale-out methods: You can scale out by nodes, memory, or cores. All three methods accept only non-zero integer input. When you scale out by cores or memory, the requested amount is converted into a node count, rounding up so that the added computing power is at least the amount requested.
Example:
When you scale out by cores, if the scale-out rule is set to 10 cores and the highest-priority specification uses 8-core nodes, triggering the rule scales out two 8-core nodes.
When you scale out by memory, if the scale-out rule is set to 20 GB and the highest-priority specification uses 16 GB nodes, triggering the rule scales out two 16 GB nodes.
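The conversion is a simple round-up, as sketched below under the assumption of a hypothetical nodes_for_scale_out helper:

```python
import math

def nodes_for_scale_out(requested: int, per_node: int) -> int:
    """Round up so the added nodes cover at least the requested cores or memory."""
    return math.ceil(requested / per_node)

print(nodes_for_scale_out(10, 8))   # 10 cores on 8-core nodes -> 2 nodes
print(nodes_for_scale_out(20, 16))  # 20 GB on 16 GB nodes     -> 2 nodes
```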
Principles for Scaling in Elastic Nodes When Scale-in Rules Are Executed
When a scale-in rule is executed, elastic nodes added through the auto-scaling feature are scaled in with idle nodes first, in reverse order of creation time (newest first). If the required scale-in number is still not met, nodes running containers are then selected. For load-based scale-in, nodes running the service associated with the load metric are considered first, again scaling in idle nodes before busy ones and following the same reverse-creation-time principle; if the required number is still not met, nodes running containers are selected. Non-elastic nodes are not affected by scale-in rules and no scale-in action is triggered for them; they can only be scaled in manually.
Note
Scheduled termination of nodes will not be constrained by the principles of scaling in nodes in reverse order of creation time and minimum number of cluster nodes. The scale-in will be executed once the set time is reached, with a default graceful scale-in period of 30 minutes.
The criterion for determining an idle node is that no containers have run on it within the last 5 minutes.
Scale in based on load, assuming the node creation time is from earliest to latest: A > B > C > D > E.
Example:
When you set a YARN load metric scale-in for 5 nodes, with C, D, and E deployed with YARN components and D and E running containers, the scale-in order when the rule is triggered will be C > E > D > B > A.
When you set a Trino load metric scale-in for 5 nodes, with C, D, and E deployed with Trino components and D and B running containers, the scale-in order when the rule is triggered will be E > C > D > A > B.
Scale in based on time, assuming the node creation time is from earliest to latest: A > B > C > D > E.
Example:
When you set node-based scale-in for 5 nodes, with D and E running containers, the scale-in order when the rule is triggered will be C > B > A > E > D.
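The ordering rules above can be summarized in a short sketch. This is illustrative only, assuming hypothetical Node fields and a scale_in_order helper; it is not the service's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    created_at: int          # larger means created later
    idle: bool               # no containers run on the node within the last 5 minutes
    components: set[str]     # e.g. {"YARN"} or {"Trino"}

def scale_in_order(nodes: list[Node], component: str | None = None) -> list[str]:
    """Order elastic nodes for scale-in; pass a component for load-based rules."""
    def sort_key(n: Node):
        runs_metric_component = component is None or component in n.components
        return (
            0 if runs_metric_component else 1,  # nodes running the metric's service first
            0 if n.idle else 1,                 # idle nodes before nodes running containers
            -n.created_at,                      # newest-created nodes first
        )
    return [n.name for n in sorted(nodes, key=sort_key)]

# Mirrors the YARN example: A..E created earliest to latest, C/D/E run YARN,
# D and E have running containers.
nodes = [
    Node("A", 1, True, set()), Node("B", 2, True, set()),
    Node("C", 3, True, {"YARN"}), Node("D", 4, False, {"YARN"}),
    Node("E", 5, False, {"YARN"}),
]
print(scale_in_order(nodes, "YARN"))  # ['C', 'E', 'D', 'B', 'A']
```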
Scale-in methods: You can scale in by nodes, memory, or cores. All three methods accept only non-zero integer input. When you scale in by cores or memory, the process protects business continuity by computing the minimum number of nodes to remove without exceeding the requested amount. Nodes with no running tasks are scaled in in reverse chronological order of creation, and at least one node is always scaled in.
Example:
When you scale in by cores with a scale-in value of 20 cores, and the cluster's elastic nodes consist of three 8-core 16 GB nodes and two 4-core 8 GB nodes (listed in reverse chronological order), the system scales in two 8-core 16 GB nodes when the rule is triggered.
When you scale in by memory with a scale-in value of 30 GB, and the cluster's elastic nodes consist of three 8-core 16 GB nodes and two 4-core 8 GB nodes (listed in reverse chronological order), the system scales in one 8-core 16 GB node when the rule is triggered.
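A minimal sketch of this selection, assuming a hypothetical nodes_for_scale_in helper: candidates are walked newest-first, selection stops before the requested amount would be exceeded, and at least one node is always selected. It reproduces the two examples above.

```python
def nodes_for_scale_in(node_sizes: list[int], requested: int) -> list[int]:
    """node_sizes: cores (or GB of memory) per idle elastic node, newest first."""
    selected: list[int] = []
    total = 0
    for size in node_sizes:
        if selected and total + size > requested:
            break                     # taking this node would exceed the requested amount
        selected.append(size)
        total += size
    return selected

# Nodes listed newest first, matching the examples above.
print(nodes_for_scale_in([8, 8, 8, 4, 4], 20))     # [8, 8] -> two 8-core nodes
print(nodes_for_scale_in([16, 16, 16, 8, 8], 30))  # [16]   -> one 16 GB node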
Principles for Triggering and Executing Scaling Rules
Elastic scaling rules can be set based on both time and load metrics. The rules follow the first triggered, first executed principle, and if multiple rules are triggered simultaneously, they are executed based on their priority order. The rule status indicates whether the rule is active or not. By default, it is enabled, but the status can be set to disabled when you want to keep the configuration without executing the rule.
Scaling based on load only.
1.1 Rules follow the first triggered, first executed principle; if multiple rules are triggered simultaneously, they are executed in priority order, for example 1 > 2 > 3 > 4 > 5.
1.2 A single load-based scaling rule can support multiple metrics. The rule is triggered when all metrics meet the conditions.
1.3 Load-based scaling can be set to monitor cluster load changes within a specific time period.
Scaling based on time only.
1.1 Rules follow the first triggered, first executed principle; if multiple rules are triggered simultaneously, they are executed in priority order, for example 1 > 2 > 3 > 4 > 5.
1.2 The rule can be set to execute repeatedly. Once the rule expires, it becomes inactive. Alarms will be sent before expiration; see the alarm configuration.
Scaling based on both load and time.
Rules follow the first triggered, first executed principle; if multiple rules are triggered simultaneously, they are executed in priority order, for example 1 > 2 > 3 > 4 > 5.
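The evaluation can be sketched as below. This is a hedged illustration only, assuming a hypothetical Rule structure and metric readings; it simply shows that a load rule fires only when all of its metric conditions are met, and that simultaneously triggered rules run in priority order.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Rule:
    name: str
    priority: int                                             # 1 is the highest priority
    conditions: list[Callable[[dict], bool]] = field(default_factory=list)

    def triggered(self, metrics: dict) -> bool:
        # A load-based rule fires only when every metric condition is satisfied.
        return all(cond(metrics) for cond in self.conditions)

def execution_order(rules: list[Rule], metrics: dict) -> list[str]:
    fired = [r for r in rules if r.triggered(metrics)]
    return [r.name for r in sorted(fired, key=lambda r: r.priority)]

metrics = {"AvailableVCores#root": 2, "ContainerPendingRatio": 0.8}
rules = [
    Rule("scale-out-on-pressure", 2,
         [lambda m: m["AvailableVCores#root"] < 5,
          lambda m: m["ContainerPendingRatio"] > 0.5]),
    Rule("scale-out-fast", 1,
         [lambda m: m["ContainerPendingRatio"] > 0.7]),
]
print(execution_order(rules, metrics))  # ['scale-out-fast', 'scale-out-on-pressure']
```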
Corresponding Relationships of Queue Load Metrics
| Component | Load Metric | Dimension | Metric Expression | Description |
| --- | --- | --- | --- | --- |
| YARN | AvailableVCores | root | AvailableVCores#root | Number of available virtual cores in the root queue |
| | | root.default | AvailableVCores#root.default | Number of available virtual cores in the root.default queue |
| | | Custom sub-queue | For example: AvailableVCores#root.test | Number of available virtual cores in the root.test queue |
| | PendingVCores | root | PendingVCores#root | Number of virtual cores needed for upcoming tasks in the root queue |
| | | root.default | PendingVCores#root.default | Number of virtual cores needed for upcoming tasks in the root.default queue |
| | | Custom sub-queue | For example: PendingVCores#root.test | Number of virtual cores needed for upcoming tasks in the root.test queue |
| | AvailableMB | root | AvailableMB#root | Available memory in the root queue (MB) |
| | | root.default | AvailableMB#root.default | Available memory in the root.default queue (MB) |
| | | Custom sub-queue | For example: AvailableMB#root.test | Available memory in the root.test queue (MB) |
| | PendingMB | root | PendingMB#root | Memory needed for upcoming tasks in the root queue (MB) |
| | | root.default | PendingMB#root.default | Memory needed for upcoming tasks in the root.default queue (MB) |
| | | Custom sub-queue | For example: PendingMB#root.test | Memory needed for upcoming tasks in the root.test queue (MB) |
| | AvailableMemPercentage | Cluster | AvailableMemPercentage | Percentage of available memory |
| | ContainerPendingRatio | Cluster | ContainerPendingRatio | Ratio of pending containers to allocated containers |
| | AppsRunning | root | AppsRunning#root | Number of running tasks in the root queue |
| | | root.default | AppsRunning#root.default | Number of running tasks in the root.default queue |
| | | Custom sub-queue | For example: AppsRunning#root.test | Number of running tasks in the root.test queue |
| | AppsPending | root | AppsPending#root | Number of pending tasks in the root queue |
| | | root.default | AppsPending#root.default | Number of pending tasks in the root.default queue |
| | | Custom sub-queue | For example: AppsPending#root.test | Number of pending tasks in the root.test queue |
| | PendingContainers | root | PendingContainers#root | Number of pending containers in the root queue |
| | | root.default | PendingContainers#root.default | Number of pending containers in the root.default queue |
| | | Custom sub-queue | For example: PendingContainers#root.test | Number of pending containers in the root.test queue |
| | AllocatedMB | root | AllocatedMB#root | Allocated memory in the root queue (MB) |
| | | root.default | AllocatedMB#root.default | Allocated memory in the root.default queue (MB) |
| | | Custom sub-queue | For example: AllocatedMB#root.test | Allocated memory in the root.test queue (MB) |
| | AllocatedVCores | root | AllocatedVCores#root | Number of virtual cores allocated to the root queue |
| | | root.default | AllocatedVCores#root.default | Number of virtual cores allocated to the root.default queue |
| | | Custom sub-queue | For example: AllocatedVCores#root.test | Number of virtual cores allocated to the root.test queue |
| | ReservedVCores | root | ReservedVCores#root | Number of virtual cores reserved in the root queue |
| | | root.default | ReservedVCores#root.default | Number of virtual cores reserved in the root.default queue |
| | | Custom sub-queue | For example: ReservedVCores#root.test | Number of virtual cores reserved in the root.test queue |
| | AllocatedContainers | root | AllocatedContainers#root | Number of containers allocated in the root queue |
| | | root.default | AllocatedContainers#root.default | Number of containers allocated in the root.default queue |
| | | Custom sub-queue | For example: AllocatedContainers#root.test | Number of containers allocated in the root.test queue |
| | ReservedMB | root | ReservedMB#root | Amount of memory reserved in the root queue (MB) |
| | | root.default | ReservedMB#root.default | Amount of memory reserved in the root.default queue (MB) |
| | | Custom sub-queue | For example: ReservedMB#root.test | Amount of memory reserved in the root.test queue (MB) |
| | AppsKilled | root | AppsKilled#root | Number of terminated tasks in the root queue |
| | | root.default | AppsKilled#root.default | Number of terminated tasks in the root.default queue |
| | | Custom sub-queue | For example: AppsKilled#root.test | Number of terminated tasks in the root.test queue |
| | AppsFailed | root | AppsFailed#root | Number of failed tasks in the root queue |
| | | root.default | AppsFailed#root.default | Number of failed tasks in the root.default queue |
| | | Custom sub-queue | For example: AppsFailed#root.test | Number of failed tasks in the root.test queue |
| | AppsCompleted | root | AppsCompleted#root | Number of completed tasks in the root queue |
| | | root.default | AppsCompleted#root.default | Number of completed tasks in the root.default queue |
| | | Custom sub-queue | For example: AppsCompleted#root.test | Number of completed tasks in the root.test queue |
| | AppsSubmitted | root | AppsSubmitted#root | Number of tasks submitted to the root queue |
| | | root.default | AppsSubmitted#root.default | Number of tasks submitted to the root.default queue |
| | | Custom sub-queue | For example: AppsSubmitted#root.test | Number of tasks submitted to the root.test queue |
| | AvailableVCoresPercentage | Cluster | AvailableVCoresPercentage | Percentage of available virtual cores in the cluster |
| | MemPendingRatio | root | MemPendingRatio#root | Percentage of pending memory in the root queue |
| | | root.default | MemPendingRatio#root.default | Percentage of pending memory in the root.default queue |
| | | Custom sub-queue | For example: MemPendingRatio#root.test | Percentage of pending memory in the root.test queue |
| Trino | FreeDistributed | Cluster | FreeDistributed | Available distributed memory in the cluster |
| | QueuedQueries | Cluster | QueuedQueries | Total number of queries waiting to be executed in the queue |