Background
A Network Address Translation (NAT) gateway is a network service that translates IP addresses between a virtual private cloud (VPC) and a public network such as the Internet. It allows resources in a VPC to access the public network through a single public IP address while preserving the security and isolation of the VPC. When a NAT gateway fails, resources in the VPC may lose access to the public network, which disrupts normal business operations. NAT gateway faults can be caused by misconfigurations, network issues, and hardware faults.
To enhance the reliability and stability of NAT gateways, it is necessary to run NAT gateway fault experiments. These experiments verify whether the system operates normally in NAT gateway fault scenarios and expose potential issues in advance, so that you can optimize the system architecture and prepare emergency plans.
Experiment Execution
Step 1: Experiment Preparation
Log in to NAT Gateway and create a gateway service. If a gateway service is already available for the experiment, proceed directly to creating the experiment.
Step 2: Create an Experiment
2. Click Skip and create a blank experiment, and fill in the basic details.
3. Select Network as the instance type and NAT Gateway as the instance object, then click Add Instance.
4. Click Add Now to add a fault action.
5. Select the fault action.
6. Set action parameters and click OK.
7. After configuring the action parameters, click Next. Configure the Guardrail Policy and Monitoring Metrics according to your actual situation, then click Submit to complete experiment creation.
Step 3: Execute the Experiment
1. View the performance metrics of the NAT Gateway instance before executing the fault.
2. Go to experiment details, and click Go to the action group for execution.
3. Click Execute to start an experiment.
4. View the details of the action execution results.
5. View the execution logs to confirm it has been executed successfully.
6. After executing the fault, view the NAT Gateway instance’s performance metrics again. You can see that the maximum concurrent connections have been updated to the value specified in the action execution parameters, indicating that the fault injection was successful.
7. Execute the fault recovery actions, view the execution logs, and confirm that the recovery actions were successful.
8. After fault recovery, view the NAT Gateway instance’s performance metrics once more. You can see that the maximum concurrent connections value has been restored to the initial value, indicating that the fault recovery was successful.
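The checks in steps 1, 6, and 8 can also be recorded and compared in a small script. The following is a minimal sketch, assuming you copy the maximum concurrent connections value from the metrics page by hand at each point. The names `ExperimentReadings` and `check_experiment` are hypothetical and not part of any product API, and the numbers are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class ExperimentReadings:
    """Maximum concurrent connections noted at three points (hypothetical helper)."""
    baseline: int        # step 1: before the fault is executed
    after_fault: int     # step 6: after the fault action has run
    after_recovery: int  # step 8: after the recovery action has run


def check_experiment(readings: ExperimentReadings, target: int) -> dict:
    """Compare the readings against the expected values.

    `target` is the value set in the action parameters (Step 2, item 6).
    """
    return {
        # Injection succeeded if the metric now equals the configured value.
        "fault_injected": readings.after_fault == target,
        # Recovery succeeded if the metric is back at its initial value.
        "fault_recovered": readings.after_recovery == readings.baseline,
    }


if __name__ == "__main__":
    # Illustrative numbers only, not taken from a real instance.
    readings = ExperimentReadings(
        baseline=1_000_000, after_fault=100_000, after_recovery=1_000_000
    )
    print(check_experiment(readings, target=100_000))
```

If the configured target equals the baseline, the first check cannot tell a successful injection from a no-op, so choose a target that differs from the initial value.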