Note
To provide better and more powerful product capabilities, TPS will be merged and upgraded into Tencent Managed Service for Prometheus (TMP). The new TMP service supports cross-region and cross-VPC monitoring, and a unified Grafana dashboard can be connected to multiple TMP instances to display data in one place. For more information on TMP billing, see Pay-as-You-Go. For Tencent Cloud resource usage details, see Billing Mode and Resource Usage. Free metrics for basic monitoring will not be billed.
TPS will be deactivated on May 16, 2022. For more information, see Announcements. Click here to try out the launched TMP service. TPS instances can no longer be created. You can use our quick migration tool to migrate your TPS instances to TMP. Before migrating, streamline your monitoring metrics or reduce the collection frequency; otherwise, higher costs may be incurred.
This document describes how to streamline TPS collection metrics to avoid unnecessary expenses after the migration to TMP.
Before configuring monitoring collection items, note the following:
TMP offers more than 100 free basic monitoring metrics as listed in Free Metrics in Pay-as-You-Go Mode.
Currently, TMP is billed by the number of monitoring data points. We recommend you optimize your collection configuration to collect only required metrics and filter out unnecessary ones. This will save costs and reduce the overall reported data volume. For more information on the billing mode and Tencent Cloud resource usage, see here.
The following describes how to add filters for ServiceMonitors, PodMonitors, and RawJobs to streamline custom metrics.
ServiceMonitors and PodMonitors use the same filtering fields, so this document uses a ServiceMonitor as an example; an equivalent PodMonitor sketch follows the filtered sample below.
Sample for ServiceMonitor:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 1.9.7
  name: kube-state-metrics
  namespace: kube-system
spec:
  endpoints:
  - bearerTokenSecret:
      key: ""
    interval: 15s # Collection frequency. You can increase it to reduce data storage costs. For example, setting it to `300s` for less important metrics reduces the collected monitoring data 20-fold.
    port: http-metrics
    scrapeTimeout: 15s
  jobLabel: app.kubernetes.io/name
  namespaceSelector: {}
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
```
To collect only the `kube_node_info` and `kube_node_role` metrics, add the `metricRelabelings` field to the endpoint list of the ServiceMonitor. Note that the field is `metricRelabelings`, not `relabelings`.
Sample for adding `metricRelabelings`:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 1.9.7
  name: kube-state-metrics
  namespace: kube-system
spec:
  endpoints:
  - bearerTokenSecret:
      key: ""
    interval: 15s # Collection frequency. You can increase it to reduce data storage costs. For example, setting it to `300s` for less important metrics reduces the collected monitoring data 20-fold.
    port: http-metrics
    scrapeTimeout: 15s # Collection timeout. TMP requires that this value not exceed the collection interval, i.e., `scrapeTimeout` <= `interval`.
    # The following four lines are added:
    metricRelabelings: # Each collected item is processed by the following rules.
    - sourceLabels: ["__name__"] # The label to check. `__name__` is the metric name; any other label carried by the item can also be used.
      regex: kube_node_info|kube_node_role # The regex the label value must match. Here, `__name__` must match `kube_node_info` or `kube_node_role`.
      action: keep # Keep the item if it meets the above conditions; otherwise, drop it.
  jobLabel: app.kubernetes.io/name
  namespaceSelector: {}
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
```
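Since a PodMonitor shares these filtering fields, the same streamlining applies with only the resource kind and the endpoint list name changed (`podMetricsEndpoints` instead of `endpoints`). A minimal sketch, with hypothetical resource and label names for illustration:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-pod-monitor     # hypothetical name
  namespace: kube-system
spec:
  podMetricsEndpoints:          # PodMonitors use `podMetricsEndpoints` instead of `endpoints`
  - port: http-metrics
    interval: 300s              # lower frequency for less important metrics
    metricRelabelings:          # same filtering fields as in a ServiceMonitor
    - sourceLabels: ["__name__"]
      regex: kube_node_info|kube_node_role
      action: keep
  selector:
    matchLabels:
      app.kubernetes.io/name: example-app   # hypothetical pod label
```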
If a Prometheus RawJob is used, filter metrics as follows.
Sample job:
```yaml
scrape_configs:
- job_name: job1
  scrape_interval: 15s # Collection frequency. You can increase it to reduce data storage costs. For example, setting it to `300s` for less important metrics reduces the collected monitoring data 20-fold.
  static_configs:
  - targets:
    - '1.1.1.1'
```
If you only need to collect the `kube_node_info` and `kube_node_role` metrics, add the `metric_relabel_configs` field. Note that the field is `metric_relabel_configs`, not `relabel_configs`.
Sample for adding `metric_relabel_configs`:
```yaml
scrape_configs:
- job_name: job1
  scrape_interval: 15s # Collection frequency. You can increase it to reduce data storage costs. For example, setting it to `300s` for less important metrics reduces the collected monitoring data 20-fold.
  static_configs:
  - targets:
    - '1.1.1.1'
  # The following four lines are added:
  metric_relabel_configs: # Each collected item is processed by the following rules.
  - source_labels: ["__name__"] # The label to check. `__name__` is the metric name; any other label carried by the item can also be used.
    regex: kube_node_info|kube_node_role # The regex the label value must match. Here, `__name__` must match `kube_node_info` or `kube_node_role`.
    action: keep # Keep the item if it meets the above conditions; otherwise, drop it.
```
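The `keep` action whitelists metrics; the same mechanism can also be inverted to drop known-noisy series or labels while keeping everything else. A minimal sketch using the standard Prometheus `drop` and `labeldrop` actions (the metric and label names below are illustrative, not taken from this document):
```yaml
scrape_configs:
- job_name: job1
  scrape_interval: 15s
  static_configs:
  - targets:
    - '1.1.1.1'
  metric_relabel_configs:
  # Drop a high-cardinality metric you never query (illustrative name).
  - source_labels: ["__name__"]
    regex: apiserver_request_duration_seconds_bucket
    action: drop
  # Remove a label that needlessly multiplies series (illustrative name).
  - regex: pod_template_hash
    action: labeldrop
```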
By default, TPS manages all the ServiceMonitors and PodMonitors in a cluster after the cluster is associated. To block the monitoring of an entire namespace, label the namespace with `tps-skip-monitor: "true"` as instructed in Labels and Selectors.
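A minimal sketch of such a namespace (the namespace name is hypothetical):
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: demo-namespace          # hypothetical namespace
  labels:
    tps-skip-monitor: "true"    # TPS skips all ServiceMonitors and PodMonitors in this namespace
```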
TPS collects monitoring data through ServiceMonitor and PodMonitor CRD resources created in your cluster. To block collection for specific ServiceMonitor or PodMonitor resources, add the `tps-skip-monitor: "true"` label to those CRD resources as instructed in Labels and Selectors.
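For example, reusing the ServiceMonitor from the samples above, only its metadata needs the label (sketch, metadata shown only):
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-state-metrics
  namespace: kube-system
  labels:
    tps-skip-monitor: "true"    # TPS will not collect from this ServiceMonitor
```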