Background
A Redis cluster is a key component for storing a business's hot data. To maintain availability, the nodes in the cluster use the Gossip protocol to determine each other's status. The heartbeat timeout (cluster-node-timeout) defaults to 15s. If the faulty node is a primary node, Tencent Cloud Redis triggers its failover mechanism and promotes one of the secondary nodes to be the new primary.
Building on this behavior, Tencent Smart Advisor-Chaotic Fault Generator provides a manual fault action that skips the node fault stage and directly triggers the HA policy. You can use it to simulate the impact on your business during the short window in which a Redis cluster fails over.
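One way to quantify the business impact of a failover is to probe the instance at a fixed interval while the experiment runs and record how long requests keep failing. The following is a minimal sketch under stated assumptions: measure_unavailability and its parameters are hypothetical names, not part of any Tencent Cloud tooling, and probe stands for whatever request your business sends to Redis.

```python
import time
from typing import Callable, Tuple


def measure_unavailability(
    probe: Callable[[], None],
    attempts: int,
    interval_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, float]:
    """Call probe() repeatedly; return (failed_calls, longest_outage_seconds).

    probe is assumed to raise an exception when the request fails.
    """
    failed = 0
    longest = 0.0
    outage_start = None  # time of the first failure in the current outage
    for _ in range(attempts):
        now = clock()
        try:
            probe()
        except Exception:
            failed += 1
            if outage_start is None:
                outage_start = now
        else:
            if outage_start is not None:
                # First success after a run of failures closes the outage.
                longest = max(longest, now - outage_start)
                outage_start = None
        sleep(interval_s)
    if outage_start is not None:
        # Still failing when the run ended.
        longest = max(longest, clock() - outage_start)
    return failed, longest
```

In a real run, probe would typically wrap a lightweight read or write issued through your Redis client, started shortly before the fault action is executed. The measured outage is only as precise as the probe interval.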
Experiment Implementation
Experiment Preparation
Prepare a multi-node cross-availability zone Redis instance.
Experiment Steps
Step 1: Create an Experiment
2. Click Skip and create a blank experiment.
3. Fill in the basic information. For Experiment Resource Object, select TencentDB for Redis memory edition under Cloud Resource Type, and add an instance.
Step 2: Add Actions
1. Click Add Now to add a fault action. Under Redis, select Redis primary-secondary switch as the fault action.
2. In the action parameter settings, select the primary/replica switch mode that matches the disaster recovery scenario you want to simulate:
Prefer switching within the same availability zone
Simulates the real HA policy of Tencent Cloud Redis when the primary node is faulty: the node with the latest data is promoted to primary; when the data is identical, priority is given to other nodes in the same availability zone.
Prefer switching across availability zones
Simulates the failure of an entire availability zone: a node in another availability zone is promoted to primary.
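The two switch modes can be read as two candidate-selection rules. The sketch below is illustrative only and is not Tencent Cloud's implementation: the Replica type, the function names, and the use of a replication offset to represent "latest data" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Replica:
    node_id: str      # hypothetical node identifier
    zone: str         # availability zone the node runs in
    repl_offset: int  # assumed measure of how up to date the node's data is


def pick_same_zone_preferred(replicas: List[Replica], primary_zone: str) -> Replica:
    # Latest data wins; on a tie, prefer a node in the faulty primary's zone.
    return max(replicas, key=lambda r: (r.repl_offset, r.zone == primary_zone))


def pick_cross_zone(replicas: List[Replica], primary_zone: str) -> Replica:
    # The whole zone is treated as down, so only nodes in other zones qualify.
    candidates = [r for r in replicas if r.zone != primary_zone]
    if not candidates:
        raise ValueError("no replica outside the failed availability zone")
    return max(candidates, key=lambda r: r.repl_offset)
```

With two equally up-to-date replicas, one in the primary's zone and one elsewhere, the first rule picks the same-zone node and the second picks the node in the other zone.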
Step 3: Execute Experiment Actions
Go to Experiment Details. In the Experiment Action Group, click Execute to start the experiment.
Check Results
Taking the cross-availability zone mode as an example, compare the state before and after fault injection and check whether the availability zone of the primary node has changed.
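If you record the node roles and availability zones before and after the experiment, the comparison can be automated. This is a rough sketch assuming a hypothetical snapshot format (node ID mapped to a role and a zone) that you fill in yourself, for example from the instance's node information in the console; it does not call any Tencent Cloud API.

```python
from typing import Dict, Optional, Tuple

# Hypothetical snapshot format: node ID -> (role, availability zone).
Snapshot = Dict[str, Tuple[str, str]]


def primary_zone(snapshot: Snapshot) -> Optional[str]:
    """Return the zone of the node whose role is 'primary', or None."""
    for role, zone in snapshot.values():
        if role == "primary":
            return zone
    return None


def primary_moved_zone(before: Snapshot, after: Snapshot) -> bool:
    """True if a primary exists in both snapshots and its zone changed."""
    zone_before = primary_zone(before)
    zone_after = primary_zone(after)
    return None not in (zone_before, zone_after) and zone_before != zone_after
```

For the cross-availability zone mode, primary_moved_zone is expected to return True; for a switch within the same availability zone it should return False.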