tencent cloud

$0 14-Day TrialExperience EdgeOne for acceleration and security protection!

Feedback

Tencent Kubernetes Engine

Configuring Log Collection via Yaml

Last updated: 2023-03-14 18:19:11
This document describes how to use CRD to configure the log collection feature of a TKE Serverless cluster via YAML.

Prerequisites

Log in to the TKE console, and enable the log collection feature for the serverless cluster. For more information, see Enabling Log Collection..

Creating the CRD

To create a collection configuration, you only need to define the LogConfig CRD. The collection component will modify the corresponding CLS log topics based on changes to the LogConfig CRD and set the bound server group. The CRD format is as follows:

clsDetail Field Description

Note:
You cannot modify a specified topic.
If the collection type is selected as "Container File Path", the corresponding path cannot be a soft link. Otherwise, the actual path of the soft link will not exist in the collector's container, resulting in log collection failure.
clsDetail:
## If the log topic is created automatically, the names of logset and topic need to be specified at the same time.
logsetName: test ## CLS logset name. Logset for the name will be created automatically if there is not any. If there is the logset, log topic will be created under it.
topicName: test ## CLS log topic name. Log topic for the name will be created automatically if there is not any.

# Select an existing logset and log topic. If the logset is specified but the log topic is not, a log topic will be created automatically.
logsetId: xxxxxx-xx-xx-xx-xxxxxxxx ## The ID of the CLS logset. The logset needs to be created in advance in CLS.
topicId: xxxxxx-xx-xx-xx-xxxxxxxx ## CLS log topic ID. The log topic needs to be created in CLS in advance and should not be occupied by other collection configurations.

logType: json_log ## Log collection format. json_log: json format. delimiter_log: separator-based format. minimalist_log: full text in a single line. multiline_log: full text in multi lines. fullregex_log: full regex format. Default value: minimalist_log
logFormat: xxx ## Log formatting method
period: 30 ## Lifecycle in days. Value range: 1–3600. `3640` indicates permanent storage.
partitionCount: ## The number (an integer) of log topic partitions. Default value: `1`. Maximum value: `10`.
tags: ## Tag description list. This parameter is used to bind a tag to a log topic. Up to nine tag key-value pairs are supported, and a resource can be bound to only one tag key.
- key: xxx ## Tag key
value: xxx ## Tag value
autoSplit: false ## Whether to enable automatic split (Boolean type). Default value: `true`.
maxSplitPartitions:
storageType: hot ## Log topic storage class. Valid values: `hot` (STANDARD); `cold` (STANDARD_IA). Default value: `hot`.
excludePaths: ## Collection path blocklist
- type: File ## Type. Valid values: `File`, `Path`. 
value: /xx/xx/xx/xx.log ## The value of `type`
indexs: ## You can customize the indexing method and field when creating a topic.
- indexName: ## When a key value or metafield index needs to be configured for a field, the metafield `Key` does not need to be prefixed with `__TAG__.` and is consistent with the one when logs are uploaded. `__TAG__.` will be prefixed automatically for display in the console.
indexType: ## Field type. Valid values: `long`, `text`, `double`
tokenizer: ## Field delimiter. Each character represents a delimiter. Only English symbols and \\n\\t\\r are supported. For `long` and `double` fields, leave it empty. For `text` fields, we recommend you use @&?|#()='",;:<>[]{}/ \\n\\t\\r\\ as the delimiter.
sqlFlag: ## Whether the analysis feature is enabled for the field (Boolean)
containZH: ## Whether Chinese characters are contained (Boolean)
region: ap-xxx ## Topic region for cross-region shipping
userDefineRule: xxxxxx ## Custom collection rule, which is a serialized JSON string
extractRule: {} ## Extraction and filter rule. If `ExtractRule` is set, `LogType` must be set.


inputDetail Field Description

inputDetail:
type: container_stdout ## Log collection type, including container_stdout (container standard output), container_file (container file), and host_file (host file)

containerStdout: ## Container standard output
namespace: default ## The Kubernetes namespace of the container to be collected. Separate multiple namespaces by comma, for example, `default,namespace`. If this field is not specified, it indicates all namespaces. Note that this field cannot be specified if `excludeNamespace` is specified.
excludeNamespace: nm1,nm2 ## The Kubernetes namespace of the container to be excluded. Separate multiple namespaces by comma, for example, `nm1,nm2`. If this field is not specified, it indicates all namespaces. Note that this field cannot be specified if `namespace` is specified.
nsLabelSelector: environment in (production),tier in (frontend) ## Filter namespaces by namespace label
allContainers: false ## Whether to collect the standard output of all containers in the specified namespace. Note that if `allContainers=true`, you cannot specify `workload`, `includeLabels`, and `excludeLabels` at the same time.
container: xxx ## Name of the container of which the logs will be collected. If the name is empty, it indicates the log names of all matching containers will be collected. 
excludeLabels: ## Pods with the specified labels will be excluded. This field cannot be specified if `workload`, `namespace`, and `excludeNamespace` are specified.
key2: value2 ## Pods with multiple values of the same key can be matched. For example, if you enter `environment = production,qa`, Pods with the `production` or `qa` value of the `environment` key will be excluded. Separate multiple values by comma. If you also specify `includeLabels`, Pods in the intersection will be matched.

includeLabels: ## Pods with the specified labels will be collected. This field cannot be specified if `workload`, `namespace`, and `excludeNamespace` are specified.
key: value1 ## The `metadata` will be carried in the log collected based on the collection rule and reported to the consumer. Pods with multiple values of the same key can be matched. For example, if you enter `environment = production,qa`, Pods with the `production` or `qa` value of the `environment` key will be matched. Separate multiple values by comma. If you also specify `excludeLabels`, Pods in the intersection will be matched.

metadataLabels: ## Specify the Pod labels to be collected as the metadata. If this field is not specified, all Pod labels will be collected as the metadata.
- label1
customLabels: ## Custom metadata
label: l1

workloads:
- container: xxx ## Name of the container to collect. If this parameter is not specified, it indicates all containers in the workload Pod will be collected.
kind: deployment ## Workload type. Supported values include deployment, daemonset, statefulset, job, and cronjob.
name: sample-app ## Workload name
namespace: prod ## Workload namespace

containerFile: ## File in the container
namespace: default ## The Kubernetes namespace of the container to be collected. A namespace must be specified.
excludeNamespace: nm1,nm2 ## The Kubernetes namespace of the container to be excluded. Separate multiple namespaces by comma, for example, `nm1,nm2`. If this field is not specified, it indicates all namespaces. Note that this field cannot be specified if `namespace` is specified.
nsLabelSelector: environment in (production),tier in (frontend) ## Filter namespaces by namespace label
container: xxx ## The name of container of which the logs will be collected. The * indicates the log names of all matching containers will be collected.
logPath: /var/logs ## Log folder. Wildcards are not supported.
filePattern: app_*.log ## Log file name. It supports the wildcards "*" and "?". "*" matches multiple random characters, and "?" matches a single random character.
customLabels: ## Custom metadata
key: value
excludeLabels: ## Pods with the specified labels will be excluded. This field cannot be specified if `workload` is specified.
key2: value2 ## Pods with multiple values of the same key can be matched. For example, if you enter `environment = production,qa`, Pods with the `production` or `qa` value of the `environment` key will be excluded. Separate multiple values by comma. If you also specify `includeLabels`, Pods in the intersection will be matched.

includeLabels: ## Pods with the specified labels will be collected. This field cannot be specified if `workload` is specified.
key: value1 ## The `metadata` will be carried in the log collected based on the collection rule and reported to the consumer. Pods with multiple values of the same key can be matched. For example, if you enter `environment = production,qa`, Pods with the `production` or `qa` value of the `environment` key will be matched. Separate multiple values by comma. If you also specify `excludeLabels`, Pods in the intersection will be matched.
metadataLabels: ## Specify the Pod labels to be collected as the metadata. If this field is not specified, all Pod labels will be collected as the metadata.
- label1 ## Pod label
workload:
container: xxx ## Name of the container to collect. If this parameter is not specified, it indicates all containers in the workload Pod will be collected.
name: sample-app ## Workload name

Log Parsing Format

Full Text in a Single Line
Full Text in Multi Lines
Single line - full regex
Multiple lines - full regex
JSON Format
Separator Format
A log with full text in a single line means a full log occupies one line. When CLS collects logs, it uses \\n as a line break to end a log. For easier structural management, each log is provided with a default key value __CONTENT__. However, the log data itself is no longer structured, and the log fields are not extracted. The Time log attribute depends on the time the log is collected. For more information, see Full text in a single line.
Assume that the raw data of a log is as follows:
Tue Jan 22 12:08:15 CST 2019 Installed: libjpeg-turbo-static-1.2.90-6.el7.x86_64
A sample of LogConfig configuration is as follows:
apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
clsDetail:
topicId: xxxxxx-xx-xx-xx-xxxxxxxx
# Single-line log
logType: minimalist_log
The data collected to CLS is as follows:
__CONTENT__:Tue Jan 22 12:08:15 CST 2019 Installed: libjpeg-turbo-static-1.2.90-6.el7.x86_64
A log with full text in multi lines means that a full log may occupy multiple lines (such as Java stacktrace). In this format, the line break \\n cannot be used as the end mark of a log. To help the CLS system distinguish among the logs, "First Line Regular Expression" is used for matching. When a log in a line matches the preset regular expression, it is considered to be the beginning of a log, and the next matching line will be the end mark of the log. In this format, a default key value __CONTENT__ is also set. However, the log data itself is no longer structured, and the log fields are not extracted. The Time log attribute depends on the time the log is collected. For more information, see Full text in multi lines.
Assume that the raw data of a multi-line log is:
2019-12-15 17:13:06,043 [main] ERROR com.test.logging.FooFactory:
java.lang.NullPointerException
at com.test.logging.FooFactory.createFoo(FooFactory.java:15)
at com.test.logging.FooFactoryTest.test(FooFactoryTest.java:11)

A sample of LogConfig is as follows:
apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
clsDetail:
topicId: xxxxxx-xx-xx-xx-xxxxxxxx
# Multi-line log
logType: multiline_log
extractRule:
# Only a line that starts with a date time is considered the beginning of a new log. Otherwise, add the line break `\\n` to the end of the current log.
beginningRegex: \\d{4}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2}:\\d{2},\\d{3}\\s.+

The data collected to CLS is as follows:
\\_\\_CONTENT__:2019-12-15 17:13:06,043 [main] ERROR com.test.logging.FooFactory:\\njava.lang.NullPointerException\\n at com.test.logging.FooFactory.createFoo(FooFactory.java:15)\\n at com.test.logging.FooFactoryTest.test(FooFactoryTest.java:11)

Full RegEx is often used to process structured logs. It parses a full log by extracting multiple key-value pairs based on a regex. For more information, see Full RegEx. Assume that the raw data of a log is as follows:
10.135.46.111 - - [22/Jan/2019:19:19:30 +0800] "GET /my/course/1 HTTP/1.1" 127.0.0.1 200 782 9703 "http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0" 0.354 0.354

A sample of LogConfig is as follows:
apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
clsDetail:
topicId: xxxxxx-xx-xx-xx-xxxxxxxx
# Full Regex
logType: fullregex_log
extractRule:
# Regular expression, in which the corresponding values will be extracted based on the `()` capture groups
logRegex: (\\S+)[^\\[]+(\\[[^:]+:\\d+:\\d+:\\d+\\s\\S+)\\s"(\\w+)\\s(\\S+)\\s([^"]+)"\\s(\\S+)\\s(\\d+)\\s(\\d+)\\s(\\d+)\\s"([^"]+)"\\s"([^"]+)"\\s+(\\S+)\\s(\\S+).*
beginningRegex: (\\S+)[^\\[]+(\\[[^:]+:\\d+:\\d+:\\d+\\s\\S+)\\s"(\\w+)\\s(\\S+)\\s([^"]+)"\\s(\\S+)\\s(\\d+)\\s(\\d+)\\s(\\d+)\\s"([^"]+)"\\s"([^"]+)"\\s+(\\S+)\\s(\\S+).*
# List of extracted keys, which are in one-to-one correspondence with the extracted values
keys: ['remote_addr','time_local','request_method','request_url','http_protocol','http_host','status','request_length','body_bytes_sent','http_referer','http_user_agent','request_time','upstream_response_time']

The data collected to CLS is as follows:
body_bytes_sent: 9703
http_host: 127.0.0.1
http_protocol: HTTP/1.1
http_referer: http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum
http_user_agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0
remote_addr: 10.135.46.111
request_length: 782
request_method: GET
request_time: 0.354
request_url: /my/course/1
status: 200
time_local: [22/Jan/2019:19:19:30 +0800]
upstream_response_time: 0.354

The multi-line - full regular expression mode is a log parsing mode where multiple key-value pairs can be extracted from a complete piece of log data that spans multiple lines in a log text file (such as Java program logs) based on a regular expression. If you don't need to extract key-value pairs, please configure it as instructed in full text in multi lines. For more information, see Full Text in Multi Lines.
Assume that the raw data of a log is as follows:
[2018-10-01T10:30:01,000] [INFO] java.lang.Exception: exception happened
at TestPrintStackTrace.f(TestPrintStackTrace.java:3)
at TestPrintStackTrace.g(TestPrintStackTrace.java:7)
at TestPrintStackTrace.main(TestPrintStackTrace.java:16)

A sample of LogConfig is as follows:
apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
clsDetail:
topicId: xxxxxx-xx-xx-xx-xxxxxxxx
# Multiple lines - full regex
logType: multiline_fullregex_log
extractRule:
# The first-line full regular expression: only a line that starts with a date time is considered the beginning of a new log. Otherwise, add the line break `\\n` to the end of the current log.
beginningRegex: \\[\\d+-\\d+-\\w+:\\d+:\\d+,\\d+\\]\\s\\[\\w+\\]\\s.*
# Regular expression, in which the corresponding values will be extracted based on the `()` capture groups
logRegex: \\[(\\d+-\\d+-\\w+:\\d+:\\d+,\\d+)\\]\\s\\[(\\w+)\\]\\s(.*)
# List of extracted keys, which are in one-to-one correspondence with the extracted values
keys:
- time
- level
- msg

Based on the extracted key, the data collected to CLS is as follows:
time: 2018-10-01T10:30:01,000`
level: INFO`
msg: java.lang.Exception: exception happened
at TestPrintStackTrace.f(TestPrintStackTrace.java:3)
at TestPrintStackTrace.g(TestPrintStackTrace.java:7)
at TestPrintStackTrace.main(TestPrintStackTrace.java:16)

A JSON log automatically extracts the key at the first layer as the field name and the value at the first layer as the field value to structure the entire log. A full log ends with a line break \\n. For more information, see JSON Format.
Assume the raw data of a JSON log is as follows:
{"remote_ip":"10.135.46.111","time_local":"22/Jan/2019:19:19:34 +0800","body_sent":23,"responsetime":0.232,"upstreamtime":"0.232","upstreamhost":"unix:/tmp/php-cgi.sock","http_host":"127.0.0.1","method":"POST","url":"/event/dispatch","request":"POST /event/dispatch HTTP/1.1","xff":"-","referer":"http://127.0.0.1/my/course/4","agent":"Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0","response_code":"200"}

A sample of LogConfig is as follows:
apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
clsDetail:
topicId: xxxxxx-xx-xx-xx-xxxxxxxx
# JSON log
logType: json_log

The data collected to CLS is as follows:
agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0
body_sent: 23
http_host: 127.0.0.1
method: POST
referer: http://127.0.0.1/my/course/4
remote_ip: 10.135.46.111
request: POST /event/dispatch HTTP/1.1
response_code: 200
responsetime: 0.232
time_local: 22/Jan/2019:19:19:34 +0800
upstreamhost: unix:/tmp/php-cgi.sock
upstreamtime: 0.232
url: /event/dispatch
xff: -

In this format, a log is structured based on the specified separator, and each complete log ends with a line break \\n. You need to define a unique key for each separate field for CLS to process separator-based logs. For more information, see Separator-based Format.
Assume the raw log is as follows:
10.20.20.10 ::: [Tue Jan 22 14:49:45 CST 2019 +0800] ::: GET /online/sample HTTP/1.1 ::: 127.0.0.1 ::: 200 ::: 647 ::: 35 ::: http://127.0.0.1/

A sample of LogConfig is as follows:
apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
clsDetail:
topicId: xxxxxx-xx-xx-xx-xxxxxxxx
# Separator log
logType: delimiter_log
extractRule:
# Separator
delimiter: ':::'
# List of extracted keys, which are in one-to-one correspondence to the separated fields
keys: ['IP','time','request','host','status','length','bytes','referer']

The data collected to CLS is as follows:
IP: 10.20.20.10
bytes: 35
host: 127.0.0.1
length: 647
referer: http://127.0.0.1/
request: GET /online/sample HTTP/1.1
status: 200
time: [Tue Jan 22 14:49:45 CST 2019 +0800]

Log Collection Types

Container standard output

Sample 1: collecting the standard output of all containers in the default namespace

apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
inputDetail:
type: container_stdout
containerStdout:
namespace: default
allContainers: true
...

Sample 2: collecting the container standard output in the Pod that belongs to ingress-gateway deployment in the production namespace

apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
inputDetail:
type: container_stdout
containerStdout:
allContainers: false
workloads:
- namespace: production
name: ingress-gateway
kind: deployment
...

Sample 3: collecting the container standard output in the Pod whose Pod labels contain “k8s-app=nginx” under the production namespace

apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
inputDetail:
type: container_stdout
containerStdout:
namespace: production
allContainers: false
includeLabels:
k8s-app: nginx
...

Container file

Sample 1: collecting the access.log file in the /data/nginx/log/ path in the nginx container in the Pod that belongs to ingress-gateway deployment under the production namespace

apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
topicId: xxxxxx-xx-xx-xx-xxxxxxxx
inputDetail:
type: container_file
containerFile:
namespace: production
workload:
name: ingress-gateway
type: deployment
container: nginx
logPath: /data/nginx/log
filePattern: access.log
...

Sample 2: collecting the access.log file in the /data/nginx/log/ path in the nginx container in the Pod whose pod labels contain “k8s-app=ingress-gateway” under the production namespace

apiVersion: cls.cloud.tencent.com/v1
kind: LogConfig
spec:
inputDetail:
type: container_file
containerFile:
namespace: production
includeLabels:
k8s-app: ingress-gateway
container: nginx
logPath: /data/nginx/log
filePattern: access.log
...

Metadata

For container standard output (container_stdout) and container files (container_file), in addition to the raw log content, the container metadata (for example, the ID of the container that generated the logs) also needs to be carried and reported to CLS. In this way, when viewing logs, users can trace the log source or search based on the container identifier or characteristics (such as container name and labels).
The following table lists the metadata:
Field
Description
cluster_id
ID of the cluster to which the log belongs
container_name
Name of the container to which the log belongs
image_name
Image name IP of the container to which the log belongs
namespace
Namespace of the Pod to which the log belongs
pod_uid
UID of the Pod to which the log belongs
pod_name
Name of the Pod to which the log belongs
pod_ip
IP of the Pod to which the log belongs
pod_lable_{label name}
Labels of the Pod to which the log belongs (for example, if a Pod has two labels: app=nginx and env=prod, the reported log will have two metadata entries attached: pod_label_app:nginx and pod_label_env:prod)

Contact Us

Contact our sales team or business advisors to help your business.

Technical Support

Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

7x24 Phone Support
Hong Kong, China
+852 800 906 020 (Toll Free)
United States
+1 844 606 0804 (Toll Free)
United Kingdom
+44 808 196 4551 (Toll Free)
Canada
+1 888 605 7930 (Toll Free)
Australia
+61 1300 986 386 (Toll Free)
EdgeOne hotline
+852 300 80699
More local hotlines coming soon