Tencent Cloud

Overview

Last updated: 2024-09-26 15:34:19

Tencent Smart Advisor-Chaotic Fault Generator (CFG) provides efficient, convenient, safe, and reliable fault injection services. It also provides industry templates, monitoring guardrails, and other core functions. CFG helps users promptly discover disaster recovery risks in their business and verify the effectiveness of high-availability plans, thereby improving system availability and resilience.
Basic Concepts
Before using CFG, familiarize yourself with the following concepts so that you can get started with the product faster.
Concept	Description	Example
Chaos engineering	The discipline of running experiments on distributed systems. Experiments refine the understanding of the system and expose its previously unknown weaknesses. The goal is to build confidence in the system's ability to withstand out-of-control conditions in the production environment.	-
Experiment	The process of verifying and improving system availability by injecting specified faults at specified locations in the system and observing the results.	-
Action	An atomic fault injected into the system during an experiment. Actions cover fault injection scenarios across IaaS, PaaS, and SaaS. Within an experiment, users can freely combine and orchestrate multiple actions. An action group is a collection of actions.	High CPU usage, CVM shutdown, and database primary/secondary switch
Object	The instance that an action acts on.	CVM and MySQL
Template	A valuable or frequently used experiment or scenario saved for quick reuse. A template includes the basic experiment information and the action orchestration, so you only need to specify the experiment object when you reuse it.	Cross-AZ disaster recovery experiment template and network fault template
Monitoring metrics	Steady-state metrics that are configured in advance to determine whether the system is running stably and whether the fault injection succeeded. Observing how they change during an experiment shows system changes in real time.	Disk usage (%)
Guardrail policy	Alarm metrics and trigger policies. When an alarm metric reaches its trigger threshold, the system can automatically stop the experiment and roll back the actions to limit the impact scope of the experiment.	If disk usage (%) reaches 90%, the experiment stops automatically.
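To make the relationship between these concepts more concrete, the following is a minimal Python sketch of an experiment that injects actions into objects, samples a monitoring metric, and is stopped by a guardrail policy. It is an illustration only and is not the CFG API: the class names (`Action`, `GuardrailPolicy`, `Experiment`), the metric name `disk_usage_percent`, the instance ID `cvm-demo-1`, and the sample values are all assumptions made for this example. It also assumes the guardrail triggers when the metric is greater than or equal to the threshold, and that actions are rolled back when the experiment ends normally.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Action:
    """One atomic fault (hypothetical model, not the CFG API)."""
    name: str    # e.g. "High CPU usage"
    target: str  # the object the action acts on, e.g. a CVM instance ID
    injected: bool = False

    def inject(self) -> None:
        self.injected = True

    def roll_back(self) -> None:
        self.injected = False


@dataclass
class GuardrailPolicy:
    """Alarm metric plus trigger threshold (assumed: triggers at >= threshold)."""
    metric: str
    threshold: float

    def is_triggered(self, value: float) -> bool:
        return value >= self.threshold


@dataclass
class Experiment:
    name: str
    actions: List[Action]  # an action group is simply a collection of actions
    guardrail: GuardrailPolicy
    log: List[str] = field(default_factory=list)

    def _roll_back_all(self) -> None:
        # Undo the faults in reverse order of injection.
        for action in reversed(self.actions):
            action.roll_back()
            self.log.append(f"rolled back {action.name} on {action.target}")

    def run(self, metric_samples: Iterable[float]) -> str:
        for action in self.actions:
            action.inject()
            self.log.append(f"injected {action.name} on {action.target}")
        for value in metric_samples:
            self.log.append(f"{self.guardrail.metric}={value}")
            if self.guardrail.is_triggered(value):
                # Guardrail hit: stop early and roll back to limit the impact.
                self._roll_back_all()
                return "stopped_by_guardrail"
        # Assumption: a normally finished experiment also recovers its faults.
        self._roll_back_all()
        return "completed"


experiment = Experiment(
    name="disk-pressure-demo",
    actions=[Action("High CPU usage", "cvm-demo-1")],
    guardrail=GuardrailPolicy(metric="disk_usage_percent", threshold=90.0),
)
status = experiment.run([72.0, 85.5, 91.2, 95.0])
print(status)
print("\n".join(experiment.log))
```

In this sketch the third sample (91.2) reaches the 90% threshold, so the run stops before the fourth sample is read and the injected action is rolled back. This mirrors the guardrail example in the table.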


