Solution to High CPU Utilization

Problem Description
High CPU utilization in TDSQL-C for MySQL clusters often leads to system anomalies such as slow responses, failure to obtain connections, and timeouts. A large number of timeout retries is often the main cause of performance "avalanches". High CPU utilization is usually caused by abnormal SQL statements; a large number of lock conflicts, lock waits, or uncommitted transactions can also cause it.
When the database executes a business query or modification statement, the CPU first requests data blocks from memory:
If the memory has the target data, the CPU will execute the computation task and return the result, which may involve actions requiring high CPU utilization such as sorting.
If the memory does not have the target data, the database will get the data from the disk.
The two data acquisition processes above are called logical read and physical read, respectively. Therefore, poorly performing SQL statements can easily cause the database to generate a lot of logical reads during the execution, resulting in high CPU utilization. They may also make the database generate a lot of physical reads, resulting in high IOPS and I/O latency.
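The balance between logical and physical reads can be estimated from InnoDB status counters. The following is a minimal Python sketch, assuming the cluster exposes the standard MySQL counters Innodb_buffer_pool_read_requests (logical read requests) and Innodb_buffer_pool_reads (reads that had to go to disk), collected for example with SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'. The function name and sample values are illustrative assumptions, not DBbrain output.

```python
def buffer_pool_read_stats(status):
    """Summarize logical vs. physical reads from InnoDB status counters.

    `status` maps counter names to values as returned by SHOW GLOBAL STATUS
    (values often arrive as strings, so they are converted to int here).
    """
    logical = int(status["Innodb_buffer_pool_read_requests"])
    physical = int(status["Innodb_buffer_pool_reads"])
    # Share of read requests served from memory without touching the disk.
    hit_ratio = 1.0 - physical / logical if logical else 0.0
    return {
        "logical_reads": logical,
        "physical_reads": physical,
        "hit_ratio": hit_ratio,
    }


# Illustrative values only (assumption, not measured data).
sample = {
    "Innodb_buffer_pool_read_requests": "1000000",
    "Innodb_buffer_pool_reads": "25000",
}
stats = buffer_pool_read_stats(sample)
print(f"hit ratio: {stats['hit_ratio']:.3f}")
```

As a rough guide, high CPU utilization together with a high hit ratio points toward heavy logical reads, while a low hit ratio points toward physical reads and I/O pressure.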
Solutions
DBbrain provides users with three major features to identify and optimize the abnormal SQL statements that cause high CPU utilization:
Exception diagnosis: It supports 24/7 exception detection and diagnosis, providing real-time optimization suggestions.
Slow SQL analysis: It analyzes slow SQL statements of the current instance and provides corresponding optimization suggestions.
Audit log analysis: It performs in-depth analysis on SQL statements and provides optimization suggestions based on TencentDB audit data (full SQL).
Method 1 (recommended): Use the "exception diagnosis" feature to troubleshoot database exceptions.
The exception diagnosis feature offers proactive fault localization and optimization, requiring no database operation and maintenance experience. It addresses not only exceptions of high CPU utilization but also nearly all frequent exceptions and failures in both read/write instances and read-only instances in a cluster.
The steps are as follows:
1. Log in to the DBbrain console, select Performance Optimization from the left navigation pane, and then click the Exception Diagnosis tab at the top.
2. Select (enter or search for) an instance ID in the top-left corner to switch to the target instance.
3. On this page, select Real-Time or Historical and specify the time to be queried. If there are any failures within this time frame, an overview of the information can be viewed in the "Diagnosis Prompt" on the right.
4. Click View Details in "Real-Time/Historical Diagnosis", or click a diagnosis item in the "Diagnosis Prompt" column, to enter the diagnosis details page. The page contains the following sections:
Event overview: Includes the diagnosis item name, time range, risk level, duration, and overview.
Description: Includes symptom snapshots and performance trends of the exception event or health check event.
Intelligent Analysis: Analyzes the root cause of the performance exception to help you locate the specific operation.
Expert Suggestion: Provides optimization suggestions, including but not limited to SQL optimization (index and rewrite), resource configuration optimization, and parameter fine-tuning.
5. Click the Optimization Suggestions tab to view the suggestions DBbrain provides for the failure, which in this case are optimization suggestions for SQL statements.
Method 2: Use the "slow SQL analysis" feature to troubleshoot SQL statements that lead to high CPU utilization.
1. Log in to the DBbrain console, select Diagnostic Optimization from the left navigation pane, and click the Slow SQL Analysis tab at the top.
2. Select (enter or search for) an instance ID in the top-left corner to switch to the target instance.
3. On the page, select the time period you wish to query. If there were slow SQL statements during this period, the SQL statistics section displays them in a bar chart showing when they occurred and how many there were.
Click a bar in the chart. The list below displays all the related slow SQL information (aggregated SQL templates), and the right side displays the execution time distribution of the SQL statements during that period.
4. You can identify and filter SQL statement execution data in the SQL statement list as follows:
4.1 Sort the SQL statements by average duration (or maximum duration). Examine the top SQL statements in terms of duration. We do not recommend you sort the statements by total duration, as the data may be affected by a high number of executions.
4.2 Then, check the numbers of returned rows and scanned rows.
If there is an SQL statement with the same "number of returned rows" and "number of scanned rows", it is very likely that the full table has been queried and returned.
If there are several SQL statements with a large number of scanned rows but few or no returned rows, the system has generated a lot of logical and physical reads. If the volume of data to be queried is too high and memory is insufficient, the request generates many physical I/O requests and consumes a lot of I/O resources. Too many logical reads consume too many CPU resources, resulting in high CPU utilization.
5. Click an SQL statement to view its details, resource consumption, and optimization suggestions.
Analysis page: You can view the complete SQL template, SQL samples, and optimization suggestions and descriptions. You can optimize SQL based on the expert recommendations provided by DBbrain to improve SQL performance and reduce execution time.
Statistics page: Based on the total execution time proportion, total lock wait time proportion, total rows scanned proportion, and total rows returned proportion in the statistics report, you can analyze the specific causes of the slow SQL occurrence and perform corresponding optimization.
Details page: You can view the user source, IP source, database, and other detailed information for this type of SQL.
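The filtering described in step 4 can be sketched in Python. This is a minimal illustration, assuming the slow SQL list has been exported as records with hypothetical field names (template, executions, total_time_s, rows_scanned, rows_returned). These names, the thresholds, and the sample data are assumptions for illustration, not DBbrain's export format.

```python
def rank_and_flag(records, min_scanned=10_000, low_return_ratio=0.01):
    """Sort slow SQL templates by average duration and flag scan patterns.

    min_scanned and low_return_ratio are illustrative thresholds (assumptions).
    """
    results = []
    for r in records:
        # Average duration is less skewed by execution count than total duration.
        avg_time = r["total_time_s"] / r["executions"] if r["executions"] else 0.0
        scanned, returned = r["rows_scanned"], r["rows_returned"]
        if scanned >= min_scanned and scanned == returned:
            flag = "possible full-table query and return"
        elif scanned >= min_scanned and returned <= scanned * low_return_ratio:
            flag = "many rows scanned, few returned"
        else:
            flag = ""
        results.append({**r, "avg_time_s": avg_time, "flag": flag})
    # Longest average duration first.
    return sorted(results, key=lambda x: x["avg_time_s"], reverse=True)


# Hypothetical sample records.
sample = [
    {"template": "SELECT * FROM orders", "executions": 4,
     "total_time_s": 48.0, "rows_scanned": 2_000_000, "rows_returned": 2_000_000},
    {"template": "SELECT id FROM users WHERE email = ?", "executions": 5000,
     "total_time_s": 100.0, "rows_scanned": 500_000, "rows_returned": 1},
    {"template": "SELECT name FROM items WHERE id = ?", "executions": 10,
     "total_time_s": 11.0, "rows_scanned": 10, "rows_returned": 10},
]

ranked = rank_and_flag(sample)
for row in ranked:
    print(f'{row["avg_time_s"]:8.2f}s  {row["flag"] or "-":40}  {row["template"]}')
```

In this sample, the users query has the largest total duration only because it runs 5,000 times. Sorting by average duration puts the orders query first, which is the reason step 4.1 advises against sorting by total duration.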
