tencent cloud

Feedback

TDSQL-C for MySQL

High CPU Utilization

Last updated: 2025-01-23 17:01:24

Problem Description

When the CPU utilization of TDSQL-C for MySQL exceeds 80%, it may lead to slow business responses, timeouts, and database connection failures.
You can view the CPU usage of TDSQL-C for MySQL on the Monitoring and Alarms page of the TDSQL-C for MySQL console or in the DBbrain console.
Note:
When the CPU utilization gets too high, and both CPU and memory need to be expanded, we recommend adjusting the compute configuration and adjusting the storage space to upgrade instance specifications to ensure normal business operation. Subsequently, you can refer to this document for troubleshooting and optimization.

Failure Risk

If the CPU utilization of TDSQL-C for MySQL remains high for a long time, the overall performance of a database will be severely compromised, and under extreme circumstances, instances may be hung up.
When the high availability (HA) system detects that an instance is hung up, it will trigger a primary-replica switch to ensure high availability of business. During the switch, the business will be unavailable for a short period of time and instances are typically unavailable for less than 60 seconds. If the switch occurs during business peak hours, it will seriously affect the stability and continuity of the business.
To protect your business from being affected by CPU resource shortage, we recommend that you optimize the application or upgrade the CPU resources for the instances with high CPU utilization. A primary-replica switch is accompanied by a disconnection lasting for just seconds; therefore, for persistent connections, your application should have a reconnection mechanism.

Possible Causes

In TDSQL-C for MySQL, CPU resources are mainly occupied by two types of threads: system threads and user threads. Therefore, on Cloud Virtual Machine (CVM) dedicated to TDSQL-C for MySQL, focusing on these two types of threads can address most failure scenarios.

User Threads

User threads are busy, and in most scenarios, this is caused by slow queries. Besides slow queries, factors such as heavy computation workload and high queries per second (QPS) also contribute.
Slow queries Long-lasting computations are executed, such as order by, group by, temporary tables, and join. This type of operations has low query efficiency, causing a single SQL statement to occupy CPU for a long period of time.
Heavy computation workload The large volume of data leads to huge computation workload.
High QPS High QPS stress makes the CPU time fully occupied. For example, if a 4-core server sustains a high QPS of 20,000 to 30,000, the total CPU time can be very long even when the CPU time of a single SQL statement is short.

System Threads

In a production environment, system thread issues occur less frequently. Generally, it is rare for multiple system threads to run at full capacity simultaneously. As long as CVM has at least 4 available CPU cores, the CPU utilization will not be too high. However, there are some bugs that may affect the CPU utilization, as shown in the figure below:


Solutions

As most CPU issues are caused by busy user threads, the following section focuses on the solutions to high CPU utilization caused by user threads.
Slow queries: Use DBbrain to identify and optimize slow queries. For details, see Slow Queries.
Heavy computation workload: The CPU utilization is too high due to the large volume of data being processed. For details, see Heavy Computation Workload.
High QPS: The CPU utilization is too high due to too many access requests. For details, see High QPS.

Directions

Slow Queries

Use DBbrain to identify and optimize the SQL statements which cause high CPU utilization:
**Exception Diagnosis (Recommended)**: This feature performs exception detection and diagnosis around the clock and provides real-time optimization suggestions. For detailed operations, see Use the exception diagnosis feature to troubleshoot database exceptions.
Slow SQL Analysis: This feature analyzes slow SQL statements of the current instance and provides optimization suggestions. For detailed operations, see Use the "slow SQL analysis" feature to troubleshoot SQL statements that lead to high CPU utilization.
Audit Log Analysis: This feature performs in-depth analysis of SQL statements and provides optimization suggestions based on cloud database audit data (full SQL). For detailed operations, see Use the "audit log analysis" feature to troubleshoot SQL statements that cause high CPU utilization.
The default value of TDSQL-C for MySQL slow query time (long_query_time) is 1s. When performance issues occur, if you find no slow queries, it is recommended to reduce the parameter value, and then observe the slow queries during a business cycle and optimize them accordingly. If no slow queries are found after you adjust the parameter value and the CPU utilization is still high, it is recommended to upgrade the CPU configuration to improve the overall database performance.

Heavy Computation Workload

If the data volume is large, even with proper indexing and execution plans, it will lead to high CPU utilization. Combining this with MySQL's one-thread-per-connection characteristic, it does not require too much concurrency to max out the CPU utilization.
Generally, there are two common solutions:
Enable read/write separation. Run queries on a read-only replica node where the business access pressure is low.
Optimize your program by splitting a large SQL query into smaller ones.

High QPS

Upgrade the CPU configuration to improve the overall database performance.
Mount read-only instances to share the load of read-write instances.
Optimize query statements to enhance efficiency.
Contact Us

Contact our sales team or business advisors to help your business.

Technical Support

Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

7x24 Phone Support
Hong Kong, China
+852 800 906 020 (Toll Free)
United States
+1 844 606 0804 (Toll Free)
United Kingdom
+44 808 196 4551 (Toll Free)
Canada
+1 888 605 7930 (Toll Free)
Australia
+61 1300 986 386 (Toll Free)
EdgeOne hotline
+852 300 80699
More local hotlines coming soon