CKafka Broker High CPU Load

Last updated: 2024-09-26 15:49:18

Background
Message middleware plays a crucial role in distributed systems. In production environments, however, various factors can drive up the CPU load on broker nodes. Common scenarios include:
High message throughput: If a topic or partition in the CKafka cluster receives very high message throughput, the broker node must process a large number of read and write operations.
Large number of consumer groups: If many consumer groups subscribe to the same topic or partition, the broker must distribute and manage messages for each consumer group.
Replication and synchronization: If data replication and synchronization are enabled for the CKafka cluster, the broker must handle the read and write operations of replication tasks as well as synchronization traffic with other brokers.
Compression and decompression: If messages are stored in compressed form, the broker must compress and decompress them, which can consume significant CPU resources.
Index and log compression: CKafka uses indexes to speed up message searches. If the index is too large or needs compression, the broker must perform index maintenance and compression operations.
High concurrent connections: If a large number of producers and consumers are connected to the broker, establishing and maintaining these connections increases the CPU load.
When a broker node experiences high CPU load, several issues may arise:
Increased delay: High CPU load may slow down message processing, increasing message transmission and processing delays. Consumers may then read from CKafka more slowly and fail to obtain the latest messages in time.
Decreased throughput: With CPU resources occupied by high-load tasks, the CKafka broker may be unable to process additional messages, reducing overall throughput. This slows both producers sending messages and consumers consuming them.
Network congestion: High CPU load may prevent the CKafka broker from processing network requests in time, leading to network congestion. This affects data replication and synchronization with other brokers and may increase replication delays or cause synchronization to fall behind.
Increased response time: Under high CPU load, the CKafka broker may be unable to respond to client requests in time, so clients wait longer. This affects the performance and response time of applications that interact with the CKafka cluster.
To help you prepare for these issues, Chaotic Fault Generator (CFG) provides the CKafka Broker High CPU Load experiment action. It tests how well your business system handles unexpected delays and recovers when CKafka brokers are under high CPU load, thereby improving business security and stability.
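To compare client-side delay before and during fault injection, a small timing harness can help. The following is a minimal sketch, not part of CFG or CKafka: `measure_latencies` and `summarize` are hypothetical helper names, and `operation` stands for whatever call your application makes against the cluster (for example, a blocking send with your own Kafka client).

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(operation: Callable[[], None], samples: int) -> List[float]:
    """Call `operation` `samples` times and return each call's duration in seconds."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        operation()  # assumption: a blocking call against the cluster
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Return the median, nearest-rank 95th percentile, and maximum latency."""
    ordered = sorted(durations)
    # Nearest-rank percentile: rank = ceil(0.95 * n), computed with integers.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }
```

Running the harness once before the experiment and once while the fault is active gives two summaries that can be compared side by side.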
Must-Knows
Instance type: Fault injection with this action is available only for CKafka Pro Edition instances. CKafka Standard Edition instances do not yet support experiments.
Instance status (recommended, not mandatory): The instance used for the experiment should carry real message production and consumption traffic and have more than 3 topic partitions, so that the impact of the fault on the business is easier to observe.
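The two items above can be expressed as a simple pre-check. This is an illustrative sketch only: `InstanceInfo` and its fields are hypothetical and do not correspond to any CKafka or CFG API; you would fill them in from whatever you know about your instance. The edition rule is treated as a hard error, while the traffic and partition items are only warnings because they are recommendations.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class InstanceInfo:
    """Hypothetical description of a CKafka instance, filled in by hand."""
    edition: str            # assumption: "pro" or "standard"
    topic_partitions: int   # number of topic partitions
    has_live_traffic: bool  # real production and consumption traffic present


def precheck(instance: InstanceInfo) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for the requirements listed under Must-Knows."""
    errors: List[str] = []
    warnings: List[str] = []
    if instance.edition != "pro":
        errors.append("Only CKafka Pro Edition instances support this action.")
    if instance.topic_partitions <= 3:
        warnings.append("More than 3 topic partitions are recommended.")
    if not instance.has_live_traffic:
        warnings.append("Real production and consumption traffic is recommended.")
    return errors, warnings
```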
Experiment Preparation
Prepare a CKafka Pro Edition instance for the experiment.
Step 1: Create an experiment
1. Log in to the Tencent Smart Advisor > Chaotic Fault Generator Console.
2. In the left sidebar, select the Experiment Management page and click Create a New Experiment.
3. Click Skip and create a blank experiment.
4. After filling in the basic information, go to Experiment Object Configuration. Select the Middleware > CKafka object type and click Add Instance. All CKafka instances in the target region are then listed, and you can filter them by Instance ID, VPC ID, Subnet ID, and Tags.
5. After you select the target instance, click Add Now to add the experiment action.
6. Select the experiment action Broker-CPU High Load, and then click Next.
7. Set the action parameters. This document uses a CPU load rate of 80% and a duration of 200 s. Then click Confirm.
8. Click Next to go to Global Configuration. For details on global configuration, see Quick Start.
9. After confirmation, click Submit.
10. Click Experiment Details to start the experiment.
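When several people run the same experiment, it can help to record the chosen action parameters in one place and sanity-check them before entering them in the console. The sketch below is an assumption-laden illustration: `CpuLoadAction` is a hypothetical record, not a CFG data structure, and the accepted ranges (1 to 100 percent, positive duration) are guesses rather than documented console limits. The defaults mirror the values used in this document.

```python
from dataclasses import dataclass


@dataclass
class CpuLoadAction:
    """Hypothetical record of the Broker-CPU High Load action parameters."""
    load_percent: int = 80       # CPU load rate used in this document
    duration_seconds: int = 200  # duration used in this document

    def __post_init__(self) -> None:
        # Assumed ranges; the console may enforce different limits.
        if not 1 <= self.load_percent <= 100:
            raise ValueError("load_percent must be between 1 and 100")
        if self.duration_seconds <= 0:
            raise ValueError("duration_seconds must be positive")
```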
Step 2: Execute the experiment
1. Before the experiment, observe the instance's monitoring data, focusing on the metrics in Advanced Monitoring on the CKafka Console.
2. Because the experiment is run manually, each fault action must be triggered manually. Click Execute on the action card to start fault injection.
3. While fault injection is in progress, click the links in the logs to view the instance in Advanced Monitoring.
4. Confirm that the CPU utilization reaches the set value.
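If you copy CPU utilization readings out of the monitoring view, the final check can be made less subjective with a small helper. This is a sketch under stated assumptions: `reached_target` is a hypothetical function, the samples are percentages you collected yourself, and the tolerance and the number of consecutive samples required are arbitrary choices, not values defined by CFG.

```python
from typing import Iterable


def reached_target(
    samples: Iterable[float],
    target_percent: float,
    tolerance: float = 5.0,
    min_consecutive: int = 3,
) -> bool:
    """Return True if CPU utilization stays within `tolerance` points below
    the target (or above it) for `min_consecutive` samples in a row."""
    streak = 0
    for value in samples:
        if value >= target_percent - tolerance:
            streak += 1
            if streak >= min_consecutive:
                return True
        else:
            streak = 0  # a dip resets the run of qualifying samples
    return False
```

Requiring several consecutive samples avoids counting a single short spike as the fault having taken effect.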