Health Check Overview

Last updated: 2024-11-21 10:17:13
    CLB instances determine the availability of real servers through health checks, preventing frontend businesses from being affected by real server exceptions and improving the overall availability of businesses.
    After health check is enabled, the CLB instance always performs health checks on real servers regardless of their weights (including a weight of 0). You can view the result in the Health status column of the instance list or on the listener's bound real server details page.
    If a real server is abnormal, the CLB instance automatically forwards new requests to other normal real servers.
    Once the abnormal real server recovers, it is put back into service and receives new requests again.
    If all real servers are detected as abnormal, requests are forwarded to all real servers.
    If health check is disabled, the CLB instance forwards traffic to all real servers, including abnormal ones. Therefore, we strongly recommend enabling health check for the CLB instance so that it can automatically check real servers and remove abnormal ones.
    By default, passive health check is enabled (and cannot be disabled) for layer-4 TCP SSL listeners and layer-7 HTTP/HTTPS listeners. The CLB instance forwards traffic to a real server and records the health status of that real server. If forwarding fails, the CLB instance tries to forward the traffic to other real servers and adds 1 to the failure count of the failed real server. When the failure count reaches 3, the real server is blocked for 10 seconds. After the blocking period ends, the CLB instance resumes forwarding traffic to the real server and monitoring its health status.
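The passive health check rule described above (block a real server for 10 seconds once its failure count reaches 3) can be sketched as follows. This is only an illustration, not the CLB implementation: the class name `PassiveHealthTracker` and its methods are hypothetical, and resetting the failure count when the blocking period ends is an assumption that the text does not state.

```python
import time


class PassiveHealthTracker:
    """Illustrative tracker for forwarding failures per real server."""

    def __init__(self, max_failures=3, block_seconds=10.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.block_seconds = block_seconds
        self.clock = clock
        self.failures = {}       # server -> failure count
        self.blocked_until = {}  # server -> time at which the block ends

    def is_available(self, server):
        """Return True if traffic may currently be forwarded to the server."""
        until = self.blocked_until.get(server)
        if until is None:
            return True
        if self.clock() >= until:
            # Blocking period is over: resume forwarding.
            # Assumption: the failure count starts again from 0.
            del self.blocked_until[server]
            self.failures[server] = 0
            return True
        return False

    def record_failure(self, server):
        """Add 1 to the failure count and block the server at the limit."""
        count = self.failures.get(server, 0) + 1
        self.failures[server] = count
        if count >= self.max_failures:
            self.blocked_until[server] = self.clock() + self.block_seconds
```

The injectable `clock` parameter is a design choice that makes the blocking period easy to exercise without waiting 10 real seconds.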

    Health Check Status

    Description of Health Check Status of a Single Listener

    The health check status of real servers is described as follows:
    Status | Description | Whether to Forward Traffic
    Detecting | The status of a new real server during the period of check interval × healthy threshold. For example, if the check interval is 2 seconds and the healthy threshold is 3, the real server remains in this status for 6 seconds. | No.
    Healthy | The real server is normal. | Yes.
    Abnormal | The real server is abnormal. | No. For a layer-4 listener or layer-7 URL-based rule, if the CLB instance detects that all real servers are unhealthy, it forwards requests to all real servers.
    Disabled | Health check is disabled. | Yes.
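As a rough sketch of how the forwarding column above could translate into target selection, the hypothetical helper below picks the real servers that receive traffic from a mapping of server to status. The function name and the input shape are assumptions made for illustration. The behavior for a mix of Detecting and Abnormal servers with no healthy one is not specified in the text; here it returns no targets.

```python
def forwarding_targets(statuses):
    """Return the servers that receive traffic.

    `statuses` maps a server identifier to one of
    'Detecting', 'Healthy', 'Abnormal', or 'Disabled'.
    """
    # Healthy servers, and servers whose health check is disabled, get traffic.
    eligible = [s for s, st in statuses.items() if st in ("Healthy", "Disabled")]
    if eligible:
        return eligible
    # If every server is abnormal, requests go to all of them.
    if statuses and all(st == "Abnormal" for st in statuses.values()):
        return list(statuses)
    # Assumption: any other combination yields no targets.
    return []
```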

    Description of Health Check Status of the List Page

    The status shown on the instance list page is derived from the health check results of all listeners under the instance:
    Status | Description
    Normal | The real servers of all listeners under this instance are normal, or health check is not enabled for any listener under this instance.
    Abnormal | Any listener under this instance is abnormal.
    Not configured | No listener/rule is configured for this instance, or no listener under this instance is bound to a real server and no listener is abnormal.

    TCP Health Check

    For layer-4 TCP listeners, you can configure TCP health check, which determines the status of real servers by sending SYN packets, i.e., by initiating a TCP three-way handshake. You can also customize the request and response content used for the check.
    
    The TCP health check mechanism is as follows:
    1. The CLB instance sends a SYN connection request packet to the real server (private IP address and health check port).
    2. If the port is listening normally, the real server returns a SYN-ACK response packet after receiving the SYN request packet.
    3. If the CLB instance receives the SYN-ACK response packet within the response timeout period, the real server is considered normal and the health check succeeds. The CLB instance then sends a TCP Reset (RST) packet to the real server to terminate the TCP connection.
    4. If the CLB instance does not receive the SYN-ACK response packet within the response timeout period, the real server is considered abnormal and the health check fails. The CLB instance then sends a TCP Reset (RST) packet to the real server to terminate the TCP connection.
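A similar probe can be approximated from user space to test whether a real server's port would pass a TCP check. The sketch below is not what the CLB instance runs: `socket.connect` completes the full three-way handshake rather than stopping at SYN-ACK, and the function name `tcp_health_check` is hypothetical. It assumes a POSIX system, where setting `SO_LINGER` with a zero linger time makes `close()` send an RST instead of a FIN.

```python
import socket
import struct


def tcp_health_check(ip, port, timeout=2.0):
    """Return True if a TCP connection to (ip, port) is established in time."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)  # plays the role of the response timeout
    try:
        # l_onoff=1, l_linger=0: closing the socket resets the connection.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack("ii", 1, 0))
        sock.connect((ip, port))
        return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return False
    finally:
        sock.close()
```

For example, `tcp_health_check("10.0.0.5", 8080)` would probe a hypothetical real server at that private IP address and port.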

    UDP Health Check

    For layer-4 UDP listeners, you can configure UDP health check, which determines the status of a real server by running the Ping command and sending UDP detection packets to the health check port. You can also customize the request and response content used for the check.
    
    The UDP health check mechanism is as follows:
    1. The CLB instance sends a Ping command to the private IP address of the real server.
    2. The CLB instance then sends a UDP detection packet to the real server (private IP address and health check port).
    3. If the Ping command succeeds and the real server does not return a "port XX unreachable" error within the response timeout period, the real server is considered normal and the health check succeeds.
    4. If the Ping command fails or the real server returns a "port XX unreachable" error within the response timeout period, the real server is considered abnormal and the health check fails.
    Note:
    1. UDP health checks are based on ICMP. Therefore, real servers must be allowed to reply with ICMP packets (i.e., the Ping command is supported) and with ICMP "port unreachable" packets (i.e., the port can be detected).
    2. If a Linux server is used as a real server, the rate at which it sends ICMP packets is limited under high concurrency, because Linux has a mechanism to defend itself against ICMP attacks. In this case, even if the real server is abnormal, it cannot return the "port XX unreachable" error to the CLB instance. The CLB instance then determines that the health check succeeded, so the reported status does not reflect the actual status of the real server. Solution: configure the UDP health check with custom input and output strings. During a health check, the custom input string is sent to the real server, and the check is considered successful only after the CLB instance receives the custom response string. This method depends on the real server, which must process the health check input string and return the custom output string.
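The custom input/output string approach from the note above can be sketched as a small probe. This is an assumption-laden illustration, not the CLB implementation: the function name `udp_health_check` and its parameters are hypothetical, and the ICMP Ping step is omitted because it needs raw sockets. On some platforms, an ICMP "port unreachable" reply surfaces on a connected UDP socket as `ConnectionRefusedError`, which the sketch treats as a failure together with timeouts.

```python
import socket


def udp_health_check(ip, port, request, expected, timeout=2.0):
    """Send `request` and succeed only if the reply equals `expected`.

    Both `request` and `expected` are bytes.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)  # plays the role of the response timeout
    try:
        # Connecting a UDP socket lets the OS report ICMP errors on it.
        sock.connect((ip, port))
        sock.send(request)
        reply = sock.recv(4096)
        return reply == expected
    except OSError:
        # Timeout, or "port unreachable" reported as ConnectionRefusedError.
        return False
    finally:
        sock.close()
```

The real server must implement the matching side: read the input string on the health check port and send back the agreed output string.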

    HTTP Health Check

    For layer-4 TCP listeners and layer-7 HTTP/HTTPS listeners, you can configure HTTP health check, which determines the status of real servers by sending HTTP requests.
    
    The HTTP health check mechanism is as follows:
    1. Based on the health check configuration, the CLB instance sends an HTTP request (with the specified check domain name) to the real server's private IP address, health check port, and check path.
    2. After receiving the request, the real server returns the corresponding HTTP status code.
    3. If the CLB instance receives the HTTP status code within the response timeout period and the code matches the configured one, the health check succeeds; otherwise, it fails.
    4. If the CLB instance does not receive a response from the real server within the response timeout period, the health check fails.
    Note:
    For layer-7 HTTPS listeners, if HTTP is selected as the backend protocol of the HTTPS listener's forwarding rules, HTTP health check is performed; if HTTPS is selected, HTTPS health check is performed. HTTPS health checks are basically the same as HTTP health checks. The difference is that HTTPS requests are sent, and the status of a real server is determined based on the status code returned over HTTPS.
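The HTTP steps above can be approximated with Python's standard `http.client` module. This is a minimal sketch under stated assumptions: the function name `http_health_check`, the use of the GET method, and the default expected status code of 200 are illustrative choices, not values taken from the CLB configuration.

```python
import http.client


def http_health_check(ip, port, path="/", host=None, expected=(200,), timeout=2.0):
    """Return True if the server answers `path` with an expected status code."""
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    # An explicit Host header stands in for the check domain name.
    headers = {"Host": host} if host else {}
    try:
        conn.request("GET", path, headers=headers)
        status = conn.getresponse().status
        return status in expected
    except (OSError, http.client.HTTPException):
        # No valid response within the timeout counts as a failed check.
        return False
    finally:
        conn.close()
```

An HTTPS variant would use `http.client.HTTPSConnection` in the same way, with certificate handling that depends on how the real server is set up.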

    Health Check Time Window

    The CLB health check mechanism improves business availability, but reacting to every individual health check failure would cause unnecessary server switches and compromise system availability. Therefore, the health check status switches between healthy and abnormal only after the same result has been observed several consecutive times within a health check time window. The health check time window is determined by the factors below:
    Health Check Factor | Description | Default Value
    Response timeout | Maximum response timeout period for a health check. If a real server fails to respond within the timeout period, the real server is considered abnormal. Value range: 2-60 seconds. | 2 seconds
    Check interval | Interval between two health checks. Value range: 5-300 seconds. | 5 seconds
    Unhealthy threshold | If a real server fails the health check n (a customizable value) consecutive times, the real server is considered unhealthy, and Abnormal is displayed in the console. Value range of n: 2-10. | 3 times
    Healthy threshold | If a real server passes the health check n (a customizable value) consecutive times, the real server is considered healthy, and Healthy is displayed in the console. Value range of n: 2-10. | 3 times
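The two thresholds above amount to counting consecutive identical results. The sketch below shows one way to model that; the class name `HealthState` is hypothetical, and it assumes that a new server starts in the Detecting status and that any result different from the previous one restarts the count.

```python
class HealthState:
    """Illustrative status tracker driven by consecutive check results."""

    def __init__(self, healthy_threshold=3, unhealthy_threshold=3):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.status = "Detecting"
        self.streak_result = None  # result of the most recent check
        self.streak_length = 0     # how many times in a row it occurred

    def record(self, passed):
        """Record one check result and return the resulting status."""
        if passed == self.streak_result:
            self.streak_length += 1
        else:
            # A different result restarts the consecutive count.
            self.streak_result = passed
            self.streak_length = 1
        if passed and self.streak_length >= self.healthy_threshold:
            self.status = "Healthy"
        elif not passed and self.streak_length >= self.unhealthy_threshold:
            self.status = "Abnormal"
        return self.status
```

With the default thresholds, a single failed check between successful ones does not change the status, which is the flapping protection the time window is meant to provide.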
    The layer-4 health check time window is calculated as follows:
    Note:
    For layer-4 health checks (TCP and UDP health checks), the interval between two checks is always the configured check interval, regardless of whether a check succeeds, fails, or times out.
    Time window of a health check with a failed result = Check interval × (Unhealthy threshold - 1)
    For example, if the response timeout period is 2 seconds, the check interval is 5 seconds, and the unhealthy threshold is 3, the time window of a health check with a failed result = 5 × (3 - 1) = 10 seconds.
    
    Time window of a health check with a successful result = Check interval × (Healthy threshold - 1)
    For example, if a successful health check response takes 1 second, the check interval is 5 seconds, and the healthy threshold is 3, the time window of a health check with a successful result = 5 × (3 - 1) = 10 seconds.
    
    The layer-7 health check time window is calculated as follows:
    Time window of a health check with a failed result = Response timeout period × Unhealthy threshold + Check interval × (Unhealthy threshold - 1)
    For example, if the response timeout period is 2 seconds, the check interval is 5 seconds, and the unhealthy threshold is 3, the time window of a health check with a failed result = 2 × 3 + 5 × (3 - 1) = 16 seconds.
    
    Time window of a health check with a successful result = Period of a successful health check response × Healthy threshold + Check interval × (Healthy threshold - 1)
    For example, if a successful health check response takes 1 second, the check interval is 5 seconds, and the healthy threshold is 3, the time window of a health check with a successful result = 1 × 3 + 5 × (3 - 1) = 13 seconds.
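The four formulas above reduce to two small functions, sketched here with hypothetical names. For layer 7, `response_time` is the response timeout period in the failed case and the duration of a successful response in the successful case.

```python
def layer4_window(check_interval, threshold):
    """Layer-4 time window: check interval x (threshold - 1), in seconds."""
    return check_interval * (threshold - 1)


def layer7_window(response_time, check_interval, threshold):
    """Layer-7 time window: the layer-4 window plus one response time per check."""
    return response_time * threshold + check_interval * (threshold - 1)


# The worked examples from the text, with a 5-second interval and threshold 3.
print(layer4_window(5, 3))     # failed or successful layer-4 result: 10
print(layer7_window(2, 5, 3))  # failed layer-7 result: 16
print(layer7_window(1, 5, 3))  # successful layer-7 result: 13
```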
    

    Health Check Identifiers

    After CLB health checks start, the real server receives health check requests in addition to normal business requests. A health check request may have the following properties:
    The health check source IP address is the CLB VIP or falls within the 100.64.0.0/10 IP range.
    A health check request from layer-4 listeners (TCP, UDP, and TCP SSL) is marked with "HEALTH CHECK".
    For a health check request from layer-7 listeners (HTTP and HTTPS), the value of the User-Agent header is clb-healthcheck.
    Note:
    For a health check request from private network classic CLB instances, the health check source IP address falls within the 169.254.128.0/17 IP range.
    For a health check request from classic network CLB instances, the health check source IP address is the physical IP address.
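On the real server side, these identifiers can be used to recognize health check traffic, for example to keep it out of access statistics. The sketch below uses Python's standard `ipaddress` module; the function names are hypothetical, and the `clb_vip` argument is an assumed way of passing in your own instance's VIP.

```python
import ipaddress

HEALTH_CHECK_RANGE = ipaddress.ip_network("100.64.0.0/10")
HEALTH_CHECK_USER_AGENT = "clb-healthcheck"


def is_health_check_source(source_ip, clb_vip=None):
    """Return True if the source IP is the CLB VIP or in 100.64.0.0/10."""
    addr = ipaddress.ip_address(source_ip)
    if clb_vip is not None and addr == ipaddress.ip_address(clb_vip):
        return True
    return addr in HEALTH_CHECK_RANGE


def is_layer7_health_check(user_agent):
    """Return True if the User-Agent header marks a layer-7 health check."""
    return user_agent == HEALTH_CHECK_USER_AGENT
```

Because the source IP ranges differ for classic CLB instances, as described in the note above, a check based only on 100.64.0.0/10 would miss health check requests from those instances.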
