
Practical Tutorial on Setting Scaling Rules

Last updated: 2024-10-30 11:50:29

    Principles for Adding Preset Resources When Scale-out Rules Are Executed

    Each cluster can be configured with up to 10 scaling specifications. When a scale-out rule is triggered, scale-out is executed in order of specification priority. If the higher-priority specification does not have enough resources, lower-priority specifications are combined with it to make up the required compute resources (pay-as-you-go and spot instances follow the same order). A sketch of this allocation logic follows the examples below.
    When resources are sufficient: 1 > 2 > 3 > 4 > 5
    Example:
    When 5 specifications are preset and resources are sufficient, if the scale-out rule is triggered to scale out 10 nodes, all 10 nodes will be scaled out using specification 1, and the other preset specifications will not be selected.
    When resources are insufficient: 1+2 > 1+2+3 > 1+2+3+4 > 1+2+3+4+5
    Example:
    When preset specification 1 has 8 nodes, specification 2 has 4 nodes, and specification 3 has 3 nodes, if the scale-out rule requires scaling out 13 nodes, 8 nodes will be scaled out using specification 1, 4 nodes using specification 2, and 1 node using specification 3, in that order.
    When the resource specification is out of stock, assuming specification 2 is unavailable: 1+3 > 1+3+4 > 1+3+4+5.
    Example:
    When preset specification 1 has 8 nodes, specification 2 is unavailable, and specification 3 has 3 nodes, if the scale-out rule is triggered to scale out 10 nodes, 8 nodes will be scaled out using specification 1, specification 2 will be skipped, and 2 nodes will be scaled out using specification 3, in that order.
    When preset specification 1 has 8 nodes and all other preset specifications are unavailable, if the scale-out rule is triggered to scale out 10 nodes, only 8 nodes will be scaled out using specification 1, resulting in a partially successful scale-out.
    Scale-out methods: You can choose from nodes, memory, or cores. All three methods accept only non-zero integer input. When you select cores or memory, the scale-out process ensures maximum computing power by converting the requested amount into the corresponding number of nodes, rounding up.
    Example:
    When you scale out by cores, if the scale-out rule is set to 10 cores but the highest-priority specification uses 8-core nodes, the rule will trigger the scale-out of two 8-core nodes.
    When you scale out by memory, if the scale-out rule is set to 20 GB but the highest-priority specification uses 16 GB nodes, the rule will trigger the scale-out of two 16 GB nodes.
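    The allocation logic above can be summarized in a short Python sketch. It is only an illustration of the documented principles; the specification fields and the plan_scale_out helper are hypothetical and not part of any EMR API.

```python
import math

def plan_scale_out(specs, demand, unit="nodes"):
    """Allocate a scale-out demand across preset specifications by priority.

    specs:  dicts in priority order, e.g.
            {"name": "spec1", "available": 8, "cores": 8, "mem_gb": 16}
    demand: requested amount in nodes, cores, or GB of memory
    unit:   "nodes", "cores", or "mem_gb"
    Returns {spec_name: nodes_to_add}. The total can fall short of the
    request when every specification runs out of stock (partial success).
    """
    plan = {}
    for spec in specs:                          # priority order: 1 > 2 > 3 > ...
        if demand <= 0:
            break
        if spec["available"] <= 0:              # out-of-stock specs are skipped
            continue
        # Convert the remaining demand into a node count for this spec,
        # rounding up so computing power never falls short of the target.
        per_node = 1 if unit == "nodes" else spec[unit]
        needed = math.ceil(demand / per_node)
        take = min(needed, spec["available"])
        plan[spec["name"]] = take
        demand -= take * per_node               # what this spec contributed
    return plan

# Example from the text: 13 nodes requested; spec 1 has 8, spec 2 has 4, spec 3 has 3.
specs = [
    {"name": "spec1", "available": 8, "cores": 8, "mem_gb": 16},
    {"name": "spec2", "available": 4, "cores": 8, "mem_gb": 16},
    {"name": "spec3", "available": 3, "cores": 8, "mem_gb": 16},
]
print(plan_scale_out(specs, 13))                # {'spec1': 8, 'spec2': 4, 'spec3': 1}

# Cores example: 10 cores requested against 8-core nodes -> two nodes.
print(plan_scale_out(specs, 10, "cores"))       # {'spec1': 2}
```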

    Principles for Scaling in Elastic Nodes When Scale-in Rules Are Executed

    Elastic nodes added through the auto-scaling feature are scaled in by prioritizing idle nodes, releasing nodes in reverse order of creation time. If that does not meet the required scale-in number, nodes running containers are then selected. For load-based scale-in, nodes running the services that report the load metrics are prioritized; within them, idle nodes are scaled in first following the same principle, and nodes running containers are selected only if the required number is still not met. Non-elastic nodes are not affected by scale-in rules and never trigger scale-in actions; they can only be scaled in manually. A sketch of this selection order follows the examples below.
    Note
    Scheduled termination of nodes will not be constrained by the principles of scaling in nodes in reverse order of creation time and minimum number of cluster nodes. The scale-in will be executed once the set time is reached, with a default graceful scale-in period of 30 minutes.
    The criterion for an idle node is that no containers have been running on it within the last 5 minutes.
    Scale in based on load, assuming the node creation time is from earliest to latest: A > B > C > D > E.
    Example:
    When you set YARN load metric scale-in for 5 nodes, with C, D, and E deployed with YARN components and D and E running containers, the scale-in order when the rule is triggered will be C > E > D > B > A.
    When you set Trino load metric scale-in for 5 nodes, with C, D, and E deployed with Trino components and D and B running containers, the scale-in order when the rule is triggered will be E > C > D > A > B.
    Scale in based on time, assuming the node creation time is from earliest to latest: A > B > C > D > E.
    Example:
    When you set node-based scale-in for 5 nodes, with D and E running containers, the scale-in order when the rule is triggered will be C > B > A > E > D.
    Scale-in methods: You can choose from nodes, memory, or cores. All three methods accept only non-zero integer input. When you select cores or memory, the scale-in process protects business continuity by converting the requested amount into a node count that does not exceed it. Nodes without running tasks are scaled in first, in reverse chronological order, and at least one node is always scaled in.
    Example:
    When you scale in by cores with a scale-in target of 20 cores, and the cluster's elastic nodes consist of three 8-core 16 GB nodes and two 4-core 8 GB nodes (listed in reverse chronological order), triggering the rule will scale in two 8-core 16 GB nodes.
    When you scale in by memory with a scale-in target of 30 GB, and the cluster's elastic nodes consist of three 8-core 16 GB nodes and two 4-core 8 GB nodes (listed in reverse chronological order), triggering the rule will scale in one 8-core 16 GB node.
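    The selection order and the cores/memory conversion described above can be reproduced with a simple sort and an accumulation loop. The sketch below is a minimal illustration of the documented principles; the node fields and helper names are hypothetical, not an EMR API.

```python
def scale_in_order(nodes, load_component=None):
    """Order elastic nodes by how early they would be selected for scale-in.

    nodes: dicts such as {"name": "C", "created": 3, "elastic": True,
                          "idle": True, "components": {"YARN"}}
    load_component: "YARN"/"Trino" for load-based scale-in, None for time-based.

    Principles from the text: only elastic nodes are candidates; for
    load-based rules, nodes running the relevant component come first;
    idle nodes (no containers in the last 5 minutes) come before busy
    ones; within each group, newer nodes are released first.
    """
    candidates = [n for n in nodes if n["elastic"]]
    return sorted(
        candidates,
        key=lambda n: (
            0 if load_component is None or load_component in n["components"] else 1,
            0 if n["idle"] else 1,
            -n["created"],                      # newer nodes first
        ),
    )

def nodes_for_scale_in(ordered, demand, unit="nodes"):
    """Decide how many of the ordered candidates to release.

    For cores ("cores") or memory ("mem_gb"), nodes are accumulated in the
    given order without exceeding the requested amount, and at least one
    node is always released, matching the examples in the text.
    """
    chosen, used = [], 0
    for node in ordered:
        amount = 1 if unit == "nodes" else node[unit]
        if chosen and used + amount > demand:
            break
        chosen.append(node)
        used += amount
        if used >= demand:
            break
    return chosen

# YARN example from the text: C, D, E run YARN; D and E have running containers.
busy, yarn_nodes = {"D", "E"}, {"C", "D", "E"}
nodes = [
    {"name": n, "created": i + 1, "elastic": True, "idle": n not in busy,
     "components": {"YARN"} if n in yarn_nodes else set()}
    for i, n in enumerate("ABCDE")
]
print([n["name"] for n in scale_in_order(nodes, "YARN")])    # ['C', 'E', 'D', 'B', 'A']

# Cores example from the text: release 20 cores from elastic nodes listed
# newest first: three 8-core 16 GB nodes, then two 4-core 8 GB nodes.
elastic = [{"cores": 8, "mem_gb": 16}] * 3 + [{"cores": 4, "mem_gb": 8}] * 2
print(len(nodes_for_scale_in(elastic, 20, "cores")))         # 2
```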

    Principles for Triggering and Executing Scaling Rules

    Elastic scaling rules can be set based on both time and load metrics. The rules follow the first triggered, first executed principle, and if multiple rules are triggered simultaneously, they are executed based on their priority order. The rule status indicates whether the rule is active or not. By default, it is enabled, but the status can be set to disabled when you want to keep the configuration without executing the rule.
    1. Scaling based on load only (see the sketch after this list).
    1.1 Follows the first triggered, first executed principle; if multiple rules are triggered simultaneously, they are executed in priority order, for example 1 > 2 > 3 > 4 > 5.
    1.2 A single load-based scaling rule can include multiple metrics. The rule is triggered only when all of its metrics meet their conditions.
    1.3 Load-based scaling can be set to monitor cluster load changes within a specific time period.
    2. Scaling based on time only.
    2.1 Follows the first triggered, first executed principle; if multiple rules are triggered simultaneously, they are executed in priority order, for example 1 > 2 > 3 > 4 > 5.
    2.2 A time-based rule can be set to execute repeatedly. Once the rule expires, it becomes inactive. Alarms are sent before expiration; see the alarm configuration.
    3. Scaling based on both load and time. Follows the first triggered, first executed principle; if multiple rules are triggered simultaneously, they are executed in priority order, for example 1 > 2 > 3 > 4 > 5.
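    The sketch below illustrates the evaluation order described above: a load rule fires only when all of its metric conditions are met, and among rules triggered at the same time the highest-priority one runs first. The rule structure, field names, and helpers are hypothetical assumptions for illustration, not an EMR API.

```python
import operator

# Comparison operators a metric condition might use (assumed set).
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

def rule_is_triggered(rule, metrics):
    """A load rule fires only when ALL of its metric conditions are met."""
    return all(
        OPS[c["op"]](metrics.get(c["metric"], 0), c["threshold"])
        for c in rule["conditions"]
    )

def pick_rule(triggered_rules):
    """First triggered, first executed; ties broken by priority (1 is highest)."""
    return min(triggered_rules, key=lambda r: (r["triggered_at"], r["priority"]))

# Hypothetical example: two rules trigger at the same time; priority decides.
rules = [
    {"name": "scale-out-A", "priority": 2, "triggered_at": 100,
     "conditions": [{"metric": "PendingContainers#root", "op": ">", "threshold": 20}]},
    {"name": "scale-out-B", "priority": 1, "triggered_at": 100,
     "conditions": [{"metric": "AvailableVCores#root", "op": "<", "threshold": 8}]},
]
metrics = {"PendingContainers#root": 35, "AvailableVCores#root": 4}
triggered = [r for r in rules if rule_is_triggered(r, metrics)]
print(pick_rule(triggered)["name"])   # scale-out-B (same trigger time, higher priority)
```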

    Corresponding Relationships of Queue Load Metrics

    | Load Type | Category | Dimension | EMR Auto-scaling Metric | Meaning of Metrics |
    | --- | --- | --- | --- | --- |
    | YARN | AvailableVCores | root | AvailableVCores#root | Number of available virtual cores in the root queue |
    | YARN | AvailableVCores | root.default | AvailableVCores#root.default | Number of available virtual cores in the root.default queue |
    | YARN | AvailableVCores | Custom sub-queue | For example: AvailableVCores#root.test | Number of available virtual cores in the root.test queue |
    | YARN | PendingVCores | root | PendingVCores#root | Number of virtual cores needed for upcoming tasks in the root queue |
    | YARN | PendingVCores | root.default | PendingVCores#root.default | Number of virtual cores needed for upcoming tasks in the root.default queue |
    | YARN | PendingVCores | Custom sub-queue | For example: PendingVCores#root.test | Number of virtual cores needed for upcoming tasks in the root.test queue |
    | YARN | AvailableMB | root | AvailableMB#root | Available memory in the root queue (MB) |
    | YARN | AvailableMB | root.default | AvailableMB#root.default | Available memory in the root.default queue (MB) |
    | YARN | AvailableMB | Custom sub-queue | For example: AvailableMB#root.test | Available memory in the root.test queue (MB) |
    | YARN | PendingMB | root | PendingMB#root | Memory needed for upcoming tasks in the root queue (MB) |
    | YARN | PendingMB | root.default | PendingMB#root.default | Memory needed for upcoming tasks in the root.default queue (MB) |
    | YARN | PendingMB | Custom sub-queue | For example: PendingMB#root.test | Memory needed for upcoming tasks in the root.test queue (MB) |
    | YARN | AvailableMemPercentage | Cluster | AvailableMemPercentage | Percentage of available memory |
    | YARN | ContainerPendingRatio | Cluster | ContainerPendingRatio | Ratio of pending containers to allocated containers |
    | YARN | AppsRunning | root | AppsRunning#root | Number of running tasks in the root queue |
    | YARN | AppsRunning | root.default | AppsRunning#root.default | Number of running tasks in the root.default queue |
    | YARN | AppsRunning | Custom sub-queue | For example: AppsRunning#root.test | Number of running tasks in the root.test queue |
    | YARN | AppsPending | root | AppsPending#root | Number of pending tasks in the root queue |
    | YARN | AppsPending | root.default | AppsPending#root.default | Number of pending tasks in the root.default queue |
    | YARN | AppsPending | Custom sub-queue | For example: AppsPending#root.test | Number of pending tasks in the root.test queue |
    | YARN | PendingContainers | root | PendingContainers#root | Number of pending containers in the root queue |
    | YARN | PendingContainers | root.default | PendingContainers#root.default | Number of pending containers in the root.default queue |
    | YARN | PendingContainers | Custom sub-queue | For example: PendingContainers#root.test | Number of pending containers in the root.test queue |
    | YARN | AllocatedMB | root | AllocatedMB#root | Allocated memory in the root queue (MB) |
    | YARN | AllocatedMB | root.default | AllocatedMB#root.default | Allocated memory in the root.default queue (MB) |
    | YARN | AllocatedMB | Custom sub-queue | For example: AllocatedMB#root.test | Allocated memory in the root.test queue (MB) |
    | YARN | AllocatedVCores | root | AllocatedVCores#root | Number of virtual cores allocated to the root queue |
    | YARN | AllocatedVCores | root.default | AllocatedVCores#root.default | Number of virtual cores allocated to the root.default queue |
    | YARN | AllocatedVCores | Custom sub-queue | For example: AllocatedVCores#root.test | Number of virtual cores allocated to the root.test queue |
    | YARN | ReservedVCores | root | ReservedVCores#root | Number of virtual cores reserved in the root queue |
    | YARN | ReservedVCores | root.default | ReservedVCores#root.default | Number of virtual cores reserved in the root.default queue |
    | YARN | ReservedVCores | Custom sub-queue | For example: ReservedVCores#root.test | Number of virtual cores reserved in the root.test queue |
    | YARN | AllocatedContainers | root | AllocatedContainers#root | Number of containers allocated in the root queue |
    | YARN | AllocatedContainers | root.default | AllocatedContainers#root.default | Number of containers allocated in the root.default queue |
    | YARN | AllocatedContainers | Custom sub-queue | For example: AllocatedContainers#root.test | Number of containers allocated in the root.test queue |
    | YARN | ReservedMB | root | ReservedMB#root | Amount of memory reserved in the root queue (MB) |
    | YARN | ReservedMB | root.default | ReservedMB#root.default | Amount of memory reserved in the root.default queue (MB) |
    | YARN | ReservedMB | Custom sub-queue | For example: ReservedMB#root.test | Amount of memory reserved in the root.test queue (MB) |
    | YARN | AppsKilled | root | AppsKilled#root | Number of terminated tasks in the root queue |
    | YARN | AppsKilled | root.default | AppsKilled#root.default | Number of terminated tasks in the root.default queue |
    | YARN | AppsKilled | Custom sub-queue | For example: AppsKilled#root.test | Number of terminated tasks in the root.test queue |
    | YARN | AppsFailed | root | AppsFailed#root | Number of failed tasks in the root queue |
    | YARN | AppsFailed | root.default | AppsFailed#root.default | Number of failed tasks in the root.default queue |
    | YARN | AppsFailed | Custom sub-queue | For example: AppsFailed#root.test | Number of failed tasks in the root.test queue |
    | YARN | AppsCompleted | root | AppsCompleted#root | Number of completed tasks in the root queue |
    | YARN | AppsCompleted | root.default | AppsCompleted#root.default | Number of completed tasks in the root.default queue |
    | YARN | AppsCompleted | Custom sub-queue | For example: AppsCompleted#root.test | Number of completed tasks in the root.test queue |
    | YARN | AppsSubmitted | root | AppsSubmitted#root | Number of tasks submitted to the root queue |
    | YARN | AppsSubmitted | root.default | AppsSubmitted#root.default | Number of tasks submitted to the root.default queue |
    | YARN | AppsSubmitted | Custom sub-queue | For example: AppsSubmitted#root.test | Number of tasks submitted to the root.test queue |
    | YARN | AvailableVCoresPercentage | Cluster | AvailableVCoresPercentage | Percentage of available virtual cores |
    | YARN | MemPendingRatio | root | MemPendingRatio#root | Percentage of available memory waiting in the root queue |
    | YARN | MemPendingRatio | root.default | MemPendingRatio#root.default | Percentage of available memory waiting in the root.default queue |
    | YARN | MemPendingRatio | Custom sub-queue | For example: MemPendingRatio#root.test | Percentage of available memory waiting in the root.test queue |
    | Trino | FreeDistributed | Cluster | FreeDistributed | Available distributed memory in the cluster |
    | Trino | QueuedQueries | Cluster | QueuedQueries | Total number of queries waiting to be executed in the queue |
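    The metric names above follow a simple pattern: queue-scoped YARN metrics take the form Category#queue (for example, AvailableVCores#root.test), while cluster-level metrics such as AvailableMemPercentage, ContainerPendingRatio, and the Trino metrics carry no queue suffix. The helper below only illustrates this naming convention; it is a hypothetical sketch, not part of any EMR SDK.

```python
def parse_metric(metric: str) -> dict:
    """Split an EMR auto-scaling metric name into its category and dimension.

    Queue-scoped metrics follow the "Category#queue" pattern; cluster-level
    metrics have no "#" and are reported for the whole cluster.
    """
    category, sep, queue = metric.partition("#")
    return {"category": category, "dimension": queue if sep else "Cluster"}

print(parse_metric("AvailableVCores#root.default"))
# {'category': 'AvailableVCores', 'dimension': 'root.default'}
print(parse_metric("QueuedQueries"))
# {'category': 'QueuedQueries', 'dimension': 'Cluster'}
```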
    
    
    