TKE Cloud Native Monitoring is compatible with the Prometheus and Grafana APIs and with the CRDs used by the mainstream prometheus-operator, which makes it flexible and extensible. Combined with open source Prometheus tools, it supports more advanced usage.
This document describes how to use auxiliary scripts and migration tools to quickly migrate a self-built Prometheus system to cloud native monitoring.
You have installed kubectl on a node of the self-built Prometheus cluster and configured kubeconfig so that you can manage the cluster through kubectl.
If the self-built Prometheus uses prometheus-operator, CRD resources such as ServiceMonitor and PodMonitor are usually used to add collection configurations dynamically. This method also applies to cloud native monitoring. If you only need to migrate from the prometheus-operator of the self-built Prometheus to cloud native monitoring, without migrating the cluster itself, there is no need to migrate the dynamic configuration: simply associate the cluster with cloud native monitoring, and the ServiceMonitor and PodMonitor resources created for the self-built Prometheus will automatically take effect in cloud native monitoring.
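To see which dynamic collection resources currently exist in the self-built cluster (an optional quick check, not part of the migration procedure itself), you can list them with kubectl:
# List all ServiceMonitor and PodMonitor resources across namespaces.
kubectl get servicemonitors.monitoring.coreos.com --all-namespaces
kubectl get podmonitors.monitoring.coreos.com --all-namespaces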
For cross-cluster migration, you can export the CRD resources from the self-built Prometheus cluster and selectively reapply them in the cluster associated with cloud native monitoring. The following describes how to export ServiceMonitor and PodMonitor resources in batches from a self-built Prometheus cluster.
Create a script named prom-backup.sh with the following contents:
#!/bin/bash
# Export every ServiceMonitor and PodMonitor resource into a separate YAML file.
_ns_list=$(kubectl get ns | awk '{print $1}' | grep -v NAME)
count=0
declare -a types=("servicemonitors.monitoring.coreos.com" "podmonitors.monitoring.coreos.com")
for _ns in ${_ns_list}; do
  ## loop over resource types
  for _type in "${types[@]}"; do
    echo "Backup type [namespace: ${_ns}, type: ${_type}]."
    _item_list=$(kubectl -n ${_ns} get ${_type} | grep -v NAME | awk '{print $1}')
    ## loop over items
    for _item in ${_item_list}; do
      _file_name=./${_ns}_${_type}_${_item}.yaml
      echo "Backup kubernetes config yaml [namespace: ${_ns}, type: ${_type}, item: ${_item}] to file: ${_file_name}"
      kubectl -n ${_ns} get ${_type} ${_item} -o yaml > ${_file_name}
      count=$((count + 1))
      echo "Backup No.${count} file done."
    done
  done
done
Run the following command to execute the prom-backup.sh script:
bash prom-backup.sh
The prom-backup.sh script exports each ServiceMonitor and PodMonitor resource into a separate YAML file. You can run the ls command to view the list of output files, as shown in the following example:
$ ls
kube-system_servicemonitors.monitoring.coreos.com_kube-state-metrics.yaml
kube-system_servicemonitors.monitoring.coreos.com_node-exporter.yaml
monitoring_servicemonitors.monitoring.coreos.com_coredns.yaml
monitoring_servicemonitors.monitoring.coreos.com_grafana.yaml
monitoring_servicemonitors.monitoring.coreos.com_kube-apiserver.yaml
monitoring_servicemonitors.monitoring.coreos.com_kube-controller-manager.yaml
monitoring_servicemonitors.monitoring.coreos.com_kube-scheduler.yaml
monitoring_servicemonitors.monitoring.coreos.com_kube-state-metrics.yaml
monitoring_servicemonitors.monitoring.coreos.com_kubelet.yaml
monitoring_servicemonitors.monitoring.coreos.com_node-exporter.yaml
You can filter and modify these YAML files and reapply them to the cluster associated with cloud native monitoring (do not apply collection rules that already exist or that duplicate existing ones). Cloud native monitoring automatically detects these dynamic collection rules and collects data accordingly.
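For example, a minimal sketch of reapplying one of the exported files (the file name is taken from the example listing above; make sure kubectl points at the cluster associated with cloud native monitoring):
kubectl apply -f monitoring_servicemonitors.monitoring.coreos.com_node-exporter.yaml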
Note: If you need to add a ServiceMonitor or PodMonitor, you can add it visually on the TKE console or create it directly with YAML. The usage is fully compatible with the Prometheus community CRDs.
If the self-built Prometheus system directly uses a native Prometheus configuration file, you can convert it into RawJobs of cloud native monitoring with a few steps on the TKE console. RawJob is compatible with the scrape_configs configuration items of the native Prometheus configuration file, and the name of each RawJob corresponds to the job_name field of the Job. To modify the global configuration, you can edit the Prometheus CRD resource of cloud native monitoring.
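As an illustration only (not the official conversion steps), the sketch below writes out a minimal native scrape_configs Job with a hypothetical exporter target; when converting on the console, the job_name becomes the RawJob name and the rest of the Job body is used as the RawJob content:
# Hypothetical example of one Job under scrape_configs in a native Prometheus config.
cat > rawjob-example.yaml <<'EOF'
- job_name: example-exporter        # becomes the RawJob name in cloud native monitoring
  scrape_interval: 30s
  static_configs:
    - targets: ['10.0.0.1:9100']    # hypothetical exporter address
EOF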
Run the following command to obtain the Prometheus information.
$ kubectl get ns
prom-fnc7bvu9 Active 13m
$ kubectl -n prom-fnc7bvu9 get prometheus
NAME               VERSION   REPLICAS   AGE
tke-cls-hha93bp9                        11m
Run the following command to modify the Prometheus configuration.
$ kubectl -n prom-fnc7bvu9 edit prometheus tke-cls-hha93bp9
Modify the required parameters in the editor that opens.
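If you want to review the current configuration before applying changes, you can dump the resource first (a read-only command using the example namespace and instance name above):
kubectl -n prom-fnc7bvu9 get prometheus tke-cls-hha93bp9 -o yaml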
The format of Prometheus aggregation (recording) rules is the same whether they come from the original static configuration (recording rule files) or from the dynamic configuration (PrometheusRule resources).
Note: If the self-built Prometheus uses aggregation rules defined in PrometheusRule resources, it is recommended to migrate them according to the above steps. If a PrometheusRule resource is created directly in the cluster with YAML, it currently cannot be displayed in cloud native monitoring on the console.
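For reference only, here is a minimal sketch of an aggregation (recording) rule in the static-configuration style, written to a local file; the group name, rule name, and expression below are hypothetical:
cat > recording-rule-example.yaml <<'EOF'
groups:
  - name: example.rules                        # hypothetical rule group name
    rules:
      - record: instance:node_cpu:avg_rate5m   # hypothetical recording rule name
        expr: avg by (instance) (rate(node_cpu_seconds_total{mode!="idle"}[5m]))
EOF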
The following takes the original alerting rule YAML of a self-built Prometheus as an example to describe how to convert it into an equivalent alarm configuration in cloud native monitoring.
- alert: NodeNotReady
  expr: kube_node_status_condition{condition="Ready",status="true"} == 0
  for: 5m
  labels:
    severity: critical
  annotations:
    description: node {{ $labels.node }} is not available for a long time (cluster id {{ $labels.cluster }})
You can use {{ $labels.cluster }} in the alarm content to represent the cluster ID.
Note: The above alarm configuration means that after a node changes to the NotReady state, an alarm is pushed if the node does not recover within 5 minutes; if it still has not recovered, the alarm is pushed again at an interval of 1 hour.
The Tencent Cloud alarm channels support SMS, email, WeChat, and mobile. You can select channels as needed.
A self-built Prometheus setup usually has many custom Grafana monitoring dashboards. If a large number of dashboards need to be migrated to another platform, exporting and importing them one by one is inefficient. You can use the grafana-backup tool to export and import Grafana dashboards in batches, as described below.
Run the following command to install grafana-backup, as shown below:
pip3 install grafana-backup
Note: It is recommended to use Python 3 to avoid compatibility problems.
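If the default Python on the machine is Python 2, one option is to install grafana-backup in an isolated Python 3 virtual environment (an optional sketch, not a required step):
# Create and activate a Python 3 virtual environment, then install the tool into it.
python3 -m venv grafana-backup-env
source grafana-backup-env/bin/activate
pip3 install grafana-backup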
Create API Keys.
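API Keys are normally created in the Grafana web UI. If you prefer the command line, the sketch below creates one through Grafana's HTTP API; the admin credentials (admin:admin) are an assumption, and the address is the example in-cluster Grafana address used later in this document:
# Create an Admin-role API key named "grafana-backup" via the Grafana HTTP API.
curl -s -X POST http://admin:admin@172.21.254.127:3000/api/auth/keys \
  -H "Content-Type: application/json" \
  -d '{"name": "grafana-backup", "role": "Admin"}'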
Back up the configurations of the dashboards that you want to export.
Run the following command to obtain the access address of the self-built Grafana, as shown below:
$ kubectl -n monitoring get svc
NAME      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
grafana   ClusterIP   172.21.254.127   <none>        3000/TCP   25h
Note: This document takes the in-cluster Grafana access address http://172.21.254.127:3000 as an example.
Run the following command to generate the grafana-backup configuration file (containing the Grafana address and API Key), as shown below:
export TOKEN=<TOKEN>
cat > ~/.grafana-backup.json <<EOF
{
  "general": {
    "debug": true,
    "backup_dir": "_OUTPUT_"
  },
  "grafana": {
    "url": "http://172.21.254.127:3000",
    "token": "${TOKEN}"
  }
}
EOF
Note: Replace <TOKEN> with the API Key of the self-built Grafana, and replace the URL with the address in your actual environment.
Run the following command to export all dashboards, as shown below:
grafana-backup save
The dashboards will be saved as a compressed file in the _OUTPUT_ directory. You can run the following command to view the files in this directory, as shown below:
$ tree _OUTPUT_
_OUTPUT_
└── 202012151049.tar.gz
0 directories, 1 file
Run the following command to generate the grafana-backup configuration file for the restore target (the cloud native monitoring Grafana), as shown below:
export TOKEN=<TOKEN>
cat > ~/.grafana-backup.json <<EOF
{
  "general": {
    "debug": true,
    "backup_dir": "_OUTPUT_"
  },
  "grafana": {
    "url": "http://prom-xxxxxx-grafana.ccs.tencent-cloud.com",
    "token": "${TOKEN}"
  }
}
EOF
Note: Replace <TOKEN> with the API Key of the cloud native monitoring Grafana, and replace the URL with the access address of the cloud native monitoring Grafana (internet access needs to be enabled).
Run the following command to import the exported dashboards to the cloud native monitoring Grafana with one click, as shown below:
grafana-backup restore _OUTPUT_/202012151049.tar.gz
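To confirm that the dashboards were imported, you can list them through the Grafana search API (a sketch that reuses the cloud native monitoring Grafana address and token from the configuration above):
# List all imported dashboards on the target Grafana.
curl -s -H "Authorization: Bearer ${TOKEN}" \
  "http://prom-xxxxxx-grafana.ccs.tencent-cloud.com/api/search?type=dash-db"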
Cloud native monitoring supports multiple clusters: it adds a "cluster" label to the data of each cluster and uses the cluster ID to distinguish different clusters, so it is recommended to add a "cluster" filter variable to all dashboards. In the Grafana dashboard, select Dashboard settings > Variables > New to create the cluster variable, as shown below:
Note: In label_values, enter any metric name that is used in the current dashboard together with the "cluster" label; in this example the variable query is label_values(node_uname_info, cluster).
Modify the PromQL query statements in all dashboards and add the filter condition cluster=~"$cluster", as shown below:
Cloud native monitoring can interconnect with self-built Grafana and AlertManager systems.
Cloud native monitoring provides a Prometheus API. If you want to use a self-built Grafana to display the monitoring data, you can add cloud native monitoring to the self-built Grafana as a Prometheus data source. You can find the Prometheus API address in the basic information of the cloud native monitoring instance on the TKE console.
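As an illustration only (you can equally add the data source in the Grafana UI), the sketch below adds the Prometheus API of cloud native monitoring to a self-built Grafana through Grafana's HTTP API; <GRAFANA_TOKEN>, <prometheus-api-address>, and the data source name are placeholders you need to replace with values from your environment:
# Register the cloud native monitoring Prometheus API as a Grafana data source.
curl -s -X POST http://172.21.254.127:3000/api/datasources \
  -H "Authorization: Bearer <GRAFANA_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "tke-cloud-native-monitoring",
        "type": "prometheus",
        "url": "http://<prometheus-api-address>",
        "access": "proxy"
      }'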