Primary-secondary Switch in MariaDB

Last updated: 2024-09-26 15:47:38

    Background

    In a database system, a primary-secondary switch is an important means of ensuring high availability. When the primary node fails, a secondary node can quickly take over its work, preserving system continuity and stability. Tencent Smart Advisor-Chaotic Fault Generator provides a fault action that simulates a primary-secondary switch in TencentDB for MariaDB. You can use this fault action to verify the overall high availability of your business MariaDB deployment.
    Primary-secondary switch experiments help developers test systems in a more complex and realistic environment so that possible problems and risks can be identified. Through chaos engineering experiments and tests, developers can gain a more comprehensive understanding of system operating modes and performance characteristics, and can then develop countermeasures and policies for different fault scenarios to improve system stability and availability.
    Note:
    A primary-secondary switch moves the primary role of the instance to one of its secondary nodes. It can be used to simulate the switch that follows an availability zone or node fault. Connections may be dropped during the switch.
    This fault can be injected in the following ways:
    Prioritize injection in the same availability zone: A secondary node in the same availability zone is selected as the switch target. If no node satisfies this condition, a switchable secondary node is searched for in other availability zones.
    Prioritize cross-availability zone injection: A secondary node in a different availability zone is preferentially selected as the switch target. If no node satisfies this condition, a switchable secondary node is searched for in the same availability zone.
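
The two injection modes differ only in which group of secondary nodes is tried first. The following is a minimal sketch of that preference-with-fallback logic, not the service's actual implementation. The `Node` type, its `switchable` flag, and the `pick_switch_target` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    """Hypothetical description of a secondary node (not a Tencent Cloud type)."""
    node_id: str
    zone: str
    switchable: bool = True


def pick_switch_target(
    primary_zone: str,
    secondaries: Sequence[Node],
    prefer_same_zone: bool,
) -> Optional[Node]:
    """Return the preferred switchable secondary, or None if there is none."""
    candidates = [n for n in secondaries if n.switchable]
    same = [n for n in candidates if n.zone == primary_zone]
    cross = [n for n in candidates if n.zone != primary_zone]

    # Try the preferred group first, then fall back to the other group.
    preferred, fallback = (same, cross) if prefer_same_zone else (cross, same)
    for group in (preferred, fallback):
        if group:
            return group[0]
    return None


if __name__ == "__main__":
    nodes = [Node("s1", "zone-a"), Node("s2", "zone-b")]
    print(pick_switch_target("zone-a", nodes, prefer_same_zone=True))
    print(pick_switch_target("zone-a", nodes, prefer_same_zone=False))
```

With a primary in `zone-a`, the same-zone mode selects `s1` and the cross-zone mode selects `s2`; if the preferred group is empty, either mode falls back to the remaining switchable node.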
    Note:
    After you initiate a task and the switch succeeds, you can observe the change in System Monitoring-Primary-secondary Switch. A 1-second momentary disconnection will occur, so make sure that your business has a database reconnection mechanism.
    The system prioritizes data consistency, so a switch may fail. If it does, retry after at least 5 minutes.
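
One possible shape for a reconnection mechanism is a retry wrapper with exponential backoff around each database operation. The sketch below is an assumption-laden illustration: `run_with_reconnect` is a hypothetical helper, and Python's built-in `ConnectionError` stands in for whatever exception your MariaDB driver raises when the connection drops.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_reconnect(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying with exponential backoff on connection loss.

    `operation` is expected to open (or reuse) a connection and do its work.
    ConnectionError is a placeholder; substitute your driver's exception type.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # Out of retries: surface the failure to the caller.
            # Wait 0.5 s, 1 s, 2 s, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With the defaults, the waits before the second through fifth attempts add up to 7.5 seconds, which comfortably covers a disconnection of about 1 second. Tune `attempts` and `base_delay` to your own latency budget.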

    Experiment Implementation

    Step 1: Experiment Preparation

    Prepare a TencentDB for MariaDB instance that has one primary node and two secondary nodes.

    Step 2: Experiment Orchestration

    1. Log in to the Tencent Smart Advisor > Chaos Engineering Console, go to the Experiment Management page, and click Create a New Experiment.
    2. Click Skip and create a blank experiment in the lower-left corner.
    3. Fill in the experiment information, select Database > MariaDB as the Object Type, and click Add Instance to add instances for the experiment.
    4. After adding an instance, click Add Now to add a fault action.
    5. Select Primary-secondary switch failure.
    6. Click Next, set the fault action parameter Switch Mode to Prefer switching across availability zones, and click Confirm.
    7. After all configurations are confirmed, click Next and then click Submit to complete experiment creation.

    Step 3: Experiment Execution

    During fault execution, a primary-secondary switch is triggered in the MariaDB instance. The change in the instance's primary-secondary node architecture can be observed in the TencentDB for MariaDB console.
    Note:
    In the TencentDB for MariaDB console, go to the monitoring and alarm module of the corresponding instance to see each node ID and its role. M is the primary node and S is a secondary node.
    1. Go to Experiment Details and click Execute on the fault action card to start the experiment.
    2. During fault injection, node changes can be observed in the TencentDB for MariaDB console.
    3. After a successful fault injection, click the action card to check the execution details. The execution logs show that the primary node has been switched from the primary availability zone to the secondary availability zone.
    4. Go to the TencentDB for MariaDB console to check node changes and confirm that the primary node has switched.
    5. Executing the fault recovery action triggers another primary-secondary switch and restores the instance deployment to its state before the fault; that is, the primary node is switched back to the original primary availability zone.
    6. After a successful recovery, check the execution logs to confirm that the primary node has been switched back to the original primary availability zone.
    7. Go to the TencentDB for MariaDB console to check instance information.
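
If you record the node roles shown in the console before and after the experiment, the comparison can be expressed as a small check. This is only an illustrative sketch: the role snapshots are assumed to be hand-copied dictionaries mapping node ID to `"M"` or `"S"`, the node IDs are made up, and `find_primary` and `describe_switch` are hypothetical helpers, not part of any Tencent Cloud SDK.

```python
from typing import Dict, Optional, Tuple


def find_primary(roles: Dict[str, str]) -> str:
    """Return the ID of the single node whose role is "M" (primary)."""
    primaries = [node_id for node_id, role in roles.items() if role == "M"]
    if len(primaries) != 1:
        raise ValueError(f"expected exactly one primary, found {len(primaries)}")
    return primaries[0]


def describe_switch(
    before: Dict[str, str], after: Dict[str, str]
) -> Optional[Tuple[str, str]]:
    """Return (old_primary, new_primary) if the primary moved, else None."""
    old, new = find_primary(before), find_primary(after)
    return None if old == new else (old, new)


if __name__ == "__main__":
    # Made-up node IDs; copy the real ones from the console.
    before = {"node-1": "M", "node-2": "S", "node-3": "S"}
    after_fault = {"node-1": "S", "node-2": "M", "node-3": "S"}
    print(describe_switch(before, after_fault))  # the primary moved
    print(describe_switch(before, before))       # no change
```

Running the same comparison on the pre-fault snapshot and a post-recovery snapshot lets you check whether the primary role is back on the expected node.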
    