Background
Intra-host network corruption is a common CVM fault that can be caused by hardware faults, improper network configuration, network congestion, and other issues. It can leave a CVM instance unable to respond to user requests and disrupt normal business operation. For businesses that depend on high availability and low latency, network corruption can cause significant inconvenience and loss.
To improve network reliability and stability on CVM, run network corruption fault experiments. These experiments verify whether the system can keep operating normally when the network is damaged, and they expose issues in network corruption scenarios in advance so that you can optimize the system architecture and prepare contingency plans.
Experiment Implementation
Step 1: Experiment Preparation
Prepare several CVM instances that are available for the experiment.
Step 2: Experiment Orchestration
1. Check the network status before fault injection. Ping the target machine and wait for its response to check network connectivity. If the target machine does not respond or the packet loss rate is high, the network may already be corrupted.
3. Click Skip and create a blank experiment. Fill in the experiment information, and add a target CVM instance.
4. Click Add Now, select Network Resource, click Intra-host network corruption, and click Next.
5. Configure fault action parameters, and click Confirm.
6. After configuring the action parameters, click Next. Configure Guardrail Policy and Monitoring Metrics based on your actual situation, and click Submit to finish creating the experiment.
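The ping check before fault injection can also be scripted so that the same baseline is recorded for every run. The following is a minimal sketch, assuming a Linux target where ping prints the usual "packets transmitted ... packet loss" summary line. The function names and the 1% loss threshold are illustrative assumptions and are not part of the product.

```python
import re
import subprocess

# Matches the summary line printed by ping, for example:
# "5 packets transmitted, 4 received, 20% packet loss, time 4005ms"
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received.*?([\d.]+)% packet loss"
)


def parse_ping_summary(output):
    """Return (transmitted, received, loss_percent), or None if no summary is found."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def is_healthy(output, max_loss_percent=1.0):
    """Treat the link as healthy if replies arrived and loss is within the threshold.

    max_loss_percent is an assumed threshold; tune it for your environment.
    """
    summary = parse_ping_summary(output)
    if summary is None:
        return False
    _transmitted, received, loss_percent = summary
    return received > 0 and loss_percent <= max_loss_percent


def ping(host, count=5, timeout_s=30):
    """Run the system ping command and return its text output."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return result.stdout
```

For example, `is_healthy(ping("10.0.0.12"))` returns a boolean for a hypothetical target address. Saving the raw output of `ping(...)` alongside the result makes it easier to compare against the checks performed after fault injection.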
Step 3: Experiment Execution
1. Go to experiment details, and click Go to the action group for execution.
2. Click Execute to start the experiment.
3. Click the Action Card to view the details of the action execution results.
4. Check the host network status after fault injection. When you ping the target machine again, the returned network packets are partially damaged.
5. Execute the recovery action, and check its details.
6. Check the recovery result. When you ping the target machine again, network transmission is back to normal and the fault has been cleared.
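The three checks in this procedure (before injection, during the fault, and after recovery) can be compared programmatically to decide whether the experiment behaved as expected. The sketch below is illustrative only: the `PhaseResult` type, the `evaluate` function, and the 2% tolerance are assumptions, and it assumes you have already counted how many probes were sent and how many replies came back undamaged in each phase.

```python
from dataclasses import dataclass


@dataclass
class PhaseResult:
    name: str    # for example "baseline", "fault", or "recovery"
    sent: int    # number of probes sent in this phase
    intact: int  # number of replies received undamaged

    @property
    def failure_rate(self):
        """Fraction of probes that were lost or damaged (1.0 if nothing was sent)."""
        if self.sent == 0:
            return 1.0
        return 1.0 - self.intact / self.sent


def evaluate(baseline, fault, recovery, tolerance=0.02):
    """Return a list of problems; an empty list means the run looks as expected.

    tolerance is an assumed margin for normal network noise.
    """
    problems = []
    if baseline.failure_rate > tolerance:
        problems.append("network was already unhealthy before injection")
    if fault.failure_rate <= baseline.failure_rate + tolerance:
        problems.append("fault injection had no measurable effect")
    if recovery.failure_rate > baseline.failure_rate + tolerance:
        problems.append("network did not return to baseline after recovery")
    return problems
```

For example, `evaluate(PhaseResult("baseline", 100, 100), PhaseResult("fault", 100, 70), PhaseResult("recovery", 100, 100))` returns an empty list, while a recovery phase that still shows damaged replies produces a message that the network did not return to baseline.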