Monitoring Add-Ons Release Notes
Last updated: 2025-02-24 18:16:10

monitor-agent Release Notes

Change Time
Version Number
Change Content
Restrictions and Impacts
2024-11-28
v1.3.17
Added timeout settings to fix the hang that occurs when standalone-metrics obtains metrics.
Made the metric port and protocol adapt to cluster upgrades.
Fixed the issue where obtaining mounted disk metrics gets stuck due to NFS failure.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-07-30
v1.3.16
Added the systemd mode for cadvisor.
Fixed the problem of repeated statistics in disk-related metrics calculation for native nodes.
Modified the Job podNormal logic for the Pod in the informer list-watch failed status.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-05-10
v1.3.14
Fixed the problem of excluding Pods in the Succeeded and Failed status during list-watch.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-03-18
v1.3.12
Exposed chart parameters to support onDelete policy upgrade.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-02-29
v1.3.11
Adapted to the GPU metrics, and changed the calculation method of GPU metrics at the Pod and node levels from aggregating container-level metrics to directly using the values exposed by the exporter.
Exposed the add-on tags used for GPU metrics: gpu-exporter: "true" is used preferentially; if this tag is not available, name: gpu-manager-ds is used.
Fixed the problem where a program panic is triggered when the GPU driver of a node is abnormal. In that case, GPU metrics are not collected, but collection of other basic metrics is not affected.
Fixed the problem where the program gets stuck in special cases when HTTP requests are sent to crane to pull data. The HTTP request is now canceled upon timeout.
Fixed the problem where, on nodes with many CPU cores, monitor-agent runs on different CPU cores at different times, so that the Pod's working_set metric grows too large over time and leads to OOM errors.
Fixed the problem where the monitoring add-ons fail to collect data because the /metrics port and protocol changed in later versions of controller-manager and scheduler.
Changed the node CPU utilization calculation to exclude iowait time. Previously, including iowait time made the calculated node CPU utilization too high on nodes with heavy I/O.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-02-04
v1.3.10
Extracted the monitor-agent privileged mode into a chart parameter. The privileged mode is disabled by default.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-08-17
v1.3.9
Fixed the problem where the workload is shown as normal while the container is still in the creating status.
Used the client-go mechanism to automatically refresh the Token to prevent Token expiration when kubeletJob is used to send requests to kubelet.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-04-25
v1.3.7
Fixed the problem where Pod-level GPU utilization (node) and GPU memory utilization (node) metrics fail to be collected properly, and the problem where Pods in the terminating status fail to be deleted due to container mounting to the host directory.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-03-21
v1.3.6
Added metrics of native nodes, including 1-minute load, total disk capacity, disk utilization, and write bandwidth of nodes.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-01-18
v1.3.5
Optimized the scenario where related monitoring metrics are not reported when cadvisor does not expose the container_fs_usage_bytes and container_fs_limit_bytes metrics.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-01-12
v1.3.4
Fixed the problem where the file system usage is 0 when the runtime is containerd.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-12-13
v1.3.3
Optimized the method of pulling basic monitoring metrics.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-11-08
v1.3.2
Fixed the problem where basic monitoring fails to report monitoring metrics properly.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-10-20
v1.3.1
Fixed the metric drop problem.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-08-25
v1.3.0
Tencent Kubernetes Engine (TKE) basic monitoring supports the following PVC monitoring metrics: PVC cloud disk size, PVC cloud disk utilization, and PVC cloud disk usage.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-08-09
v1.2.2
Updated the GPU metrics calculation method.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-07-28
v1.2.1
Updated the methods of calculating the node CPU packing rate and node memory packing rate.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-07-25
v1.2.0
Added the following metrics: Pod CPU optimizable amount, Pod memory optimizable amount, node CPU packing rate, and node memory packing rate.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-07-21
v1.1.1
Fixed the problem where the basic monitoring add-ons do not complete the collection, calculation, and reporting tasks within the corresponding cycles.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-07-05
v1.1.0
tke-monitor-agent mounts the host paths /proc/meminfo and /proc/cpuinfo to collect node CPU utilization and memory utilization.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2022-06-23
v1.0.0
Managed the basic monitoring add-ons by using chart.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
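
The v1.3.11 entry above changes node CPU utilization so that iowait time no longer counts as busy time. The following is a minimal sketch of that idea, not monitor-agent's actual implementation: it assumes utilization is derived from two samples of the aggregate `cpu` line in /proc/stat, and the names `parse_cpu_line` and `cpu_utilization` are hypothetical.

```python
from typing import Dict

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Parse the aggregate 'cpu' line of /proc/stat into named counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Pad with zeros in case a kernel reports fewer counters.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_utilization(prev: Dict[str, int], curr: Dict[str, int],
                    exclude_iowait: bool = True) -> float:
    """Return CPU utilization in percent between two samples.

    With exclude_iowait=True, time spent waiting for I/O is treated like
    idle time, so heavy disk or NFS waits do not inflate the result.
    """
    delta = {k: curr[k] - prev[k] for k in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        return 0.0
    not_busy = delta["idle"] + (delta["iowait"] if exclude_iowait else 0)
    return 100.0 * (total - not_busy) / total


if __name__ == "__main__":
    # Two made-up samples from a node that spends most of its time in iowait.
    before = parse_cpu_line("cpu 100 0 50 800 50 0 0 0")
    after = parse_cpu_line("cpu 200 0 100 1200 500 0 0 0")
    print(cpu_utilization(before, after, exclude_iowait=False))
    print(cpu_utilization(before, after))
```

On an I/O-heavy node, the two calls differ by the share of time spent in iowait, which is the overestimate the release note describes.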

clustermonitor Release Notes

Change Time
Version Number
Change Content
Restrictions and Impacts
2025-01-08
v1.3.2
Added the control plane add-on monitoring capability.
This upgrade will not affect the existing business. During the upgrade, add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-11-20
v1.2.0
Added the metric monitoring capability for native node sub-machines, submitting dimension service data.
Fixed the monitoring add-on exceptions in the CDC cluster.
This upgrade will not affect the existing business. During the upgrade, add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-10-30
v1.1.0
Allowed measurement data reporting to be enabled through the measure-enabled switch.
This upgrade will not affect the existing business. During the upgrade, add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-09-24
v1.0.13
Fixed the panic issue caused by clustermonitor failing to initialize proxy in CDC cluster scenarios.
Fixed the issue where the total GPU is reported as 0 after the user modifies the node alias in the user cluster.
Fixed standaloneMetrics so that, now that nodes are allowed to change instanceid, node-related metrics are reported under the new instanceid.
This upgrade will not affect the existing business. During the upgrade, add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-07-30
v1.0.12
Removed the dependency of master monitoring capability on token, and removed --token from startup parameters.
Fixed SSRF vulnerabilities.
Fixed the concurrency problem when the CPU and memory of cluster nodes are retrieved.
Fixed the problem of no value for the workload GPU utilization because the total cluster GPU is calculated as zero.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2024-03-27
v1.0.11
Allowed the managed cluster to report cluster storage object quantity metrics (pods, configmaps, and others).
When the total GPU cores and GPU memory of the cluster are calculated, data is now collected from gpu-exporter on each node first. If no data can be collected, the values are obtained from the Status field of each node.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-11-05
v1.0.10
Allowed cluster-monitor to collect metrics of the three major add-ons in the managed cluster.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-08-21
v1.0.9
Fixed the problem where the number of workload replicas associated with the Horizontal Pod Autoscaler (HPA) is scaled out to the maximum due to excessively high CPU usage when the HPA created by the user is based on the CPU usage in core resource metrics.
Allowed deployment of CDC scenarios to nodes.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-08-15
v1.0.8
Allowed deployment of CDC scenarios to user clusters.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-06-20
v1.0.7
Optimized cost metrics reporting logic.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-06-08
v1.0.6
Fixed the problem where the k8s_pod_ping_succeed metric is not reported when the Pod is not in the running status.
Fixed the problem where the data cache is not cleaned up when the number of data entries reported to barad exceeds 1,000.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-04-03
v1.0.5
Added the annotation service.kubernetes.io/qcloud-loadbalancer-multiplex: "true" to the clustermonitor service to reuse ENILB in an independent cluster scenario with the inspection add-on.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-03-29
v1.0.4
Added the Node status, Pod Ready status, and cost metrics collection and reporting.
Optimized metric retrieval for the HPA data source hpa-metrics-server.
Upgraded the metrics-server version.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-03-24
v1.0.3
Fixed the problem of clustermonitor version upgrade failure.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-03-16
v1.0.2
Fixed the problem of apiserver CPU/mem utilization drop.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
2023-03-14
v1.0.1
Managed the basic monitoring add-ons by using chart.
This upgrade will not affect the existing business. During the upgrade, some add-ons may be unavailable, so it is recommended to perform the upgrade during off-peak periods.
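
The v1.0.11 entry above describes a preferred data source with a fallback when the cluster's total GPU is calculated. The sketch below illustrates that pattern only. It assumes the fallback is applied per node, which the release note does not state, and the function `total_gpu`, the `read_exporter` callback, and the data shapes are all hypothetical rather than clustermonitor's API.

```python
from typing import Callable, Dict, Optional


def total_gpu(status_values: Dict[str, float],
              read_exporter: Callable[[str], Optional[float]]) -> float:
    """Sum a GPU quantity over all nodes, preferring exporter data.

    status_values maps node name to the value taken from the node's
    Status field. read_exporter returns the value reported by the
    exporter on that node, or None (or raises) when nothing is collected.
    """
    total = 0.0
    for node, status_value in status_values.items():
        try:
            exporter_value = read_exporter(node)
        except Exception:
            # Treat any collection failure as "no data" and fall back.
            exporter_value = None
        total += status_value if exporter_value is None else exporter_value
    return total


if __name__ == "__main__":
    # Made-up values: the exporter has no data for node-c.
    from_status = {"node-a": 2.0, "node-b": 4.0, "node-c": 1.0}
    from_exporter = {"node-a": 2.0, "node-b": 3.5}
    print(total_gpu(from_status, from_exporter.get))
```

Falling back per node keeps the total from collapsing to zero when the exporter is missing on only some nodes.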

