Overview
Issues in a cluster, such as abnormal node status and Pod restarts, arise continuously and unpredictably. If they are not detected promptly, users miss the best window to handle them, and by the time a problem worsens enough to affect the business, it is often too late.
Event logs record comprehensive information about cluster status changes, helping users detect and troubleshoot problems as soon as they occur.
Event Log Definition
An event log is one of the resource objects in Kubernetes. It is usually used to record status changes within a cluster, ranging from node exceptions to Pod startup and successful scheduling. You can run the kubectl describe command to view the events of a resource.
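For example, you can view the events of a single resource or list recent events in a namespace with the following commands (replace the placeholder Pod name and namespace with your own); in the describe output, the events appear in the Events section at the end:
kubectl describe pod <pod-name> -n <namespace>
kubectl get events -n <namespace> --sort-by=.lastTimestamp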
Event Log Fields
Level (type): Currently only the Normal and Warning levels are supported. You can define custom levels if necessary.
Resource type/object (involvedObject): Object involved in the event, such as a Pod, Deployment, or Node.
Event source (source): Component that reports the event, such as Scheduler or Kubelet.
Content (reason): Brief description of the current event. Generally an enumerated value used within the program.
Detailed description (message): Detailed description of the current event.
Number of occurrences (count): Number of times the event has occurred.
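These fields map directly to the Kubernetes Event object. As a minimal sketch, assuming you have kubectl access to the cluster, the following command lists recent events with the fields described above (the column names are arbitrary labels chosen for this example):
kubectl get events --sort-by=.lastTimestamp -o custom-columns=TYPE:.type,OBJECT:.involvedObject.name,SOURCE:.source.component,REASON:.reason,MESSAGE:.message,COUNT:.count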
Using Event Logs for Troubleshooting
CLS provides a one-stop service for Kubernetes event logs, including collection, storage, search, and analysis capabilities. You only need to enable the cluster event log feature with a few clicks to obtain a visual event log analysis dashboard out of the box. With visual charts, you can easily solve most common Ops problems via the console.
Prerequisites
You have purchased the Tencent Kubernetes Engine (TKE) service and enabled the cluster event log feature. For more information, see Event Storage.
Scenario 1: An exception occurred on a node, and you need to locate the cause
1. Log in to the CLS console.
2. On the left sidebar, click Log Management > Event Log.
3. On the Event Search page, click the Event Overview tab and enter the name of the abnormal node as a filter.
The query result is displayed.
Check the abnormal event trend and the top abnormal events. In this example, starting from 2020-11-25, the node 172.16.18.13 became abnormal due to insufficient disk space, and Kubelet then began evicting Pods on the node to reclaim disk space.
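To cross-check this finding outside the console, a quick sketch (assuming kubectl access and the node name from this example) is to inspect the node conditions reported by Kubelet, where a DiskPressure condition with status True indicates insufficient disk space:
kubectl describe node 172.16.18.13
kubectl get node 172.16.18.13 -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}'
In the describe output, check the Conditions section and the events listed at the end.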
Scenario 2: Node scale-out was triggered, and you need to trace the scale-out process to determine the cause
For clusters with node pool auto scaling enabled, the Cluster Autoscaler (CA) component automatically increases or decreases the number of nodes in the cluster based on the actual load. If nodes in the cluster are automatically scaled out, users can trace the entire scaling process through event search.
1. Log in to the CLS console.
2. On the left sidebar, click Log Management > Event Log.
3. On the Event Search page, click the Global Search tab, and enter the following search command:
event.source.component : "cluster-autoscaler"
4. In the Hidden Fields area on the left, select event.reason, event.message, and event.involvedObject.name for display, and sort the query results in descending order by Log Time.
From the event flow, you can see that the node scale-out occurred around 2020-11-25 20:35:45 and was triggered by three NGINX Pods (nginx-5dbf784b68-tq8rd, nginx-5dbf784b68-fpvbx, and nginx-5dbf784b68-v9jv5). After three nodes were scaled out, no further scale-out was triggered because the number of nodes in the node pool had reached the upper limit.
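If you only want to see scale-out decisions, you can narrow the search by event reason. This is a sketch that assumes the standard Cluster Autoscaler event reason TriggeredScaleUp and the AND operator of the CLS search syntax; adjust the reason value if your CA version reports a different one:
event.source.component : "cluster-autoscaler" AND event.reason : "TriggeredScaleUp"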