Metric Storage Overview

Last updated: 2024-09-20 17:48:27

    Basic Concepts

    Metric

    Metrics are measurements used to assess the performance and operation of systems and applications, such as CPU utilization, memory utilization, access throughput, response time, and response success rate. Metrics are typically generated at regular intervals, producing a value at each point in time. Over time, these values form a sequence, which is commonly referred to as a time series.
    Cloud Log Service (CLS) is compatible with the Prometheus metrics data model, storing timestamped metric data with the same metric name and labels as time series. Each data point in the time series is referred to as a sample, which consists of a timestamp and a sample value.
    For example, the total number of requests for a particular API at 15:35:23.123 on December 30, 2020 would be considered a sample, with the data as follows:
    requests_total{method="POST", handler="/messages"} 217
    It is composed of the following parts:
    Metric name: requests_total
    Labels: {method="POST", handler="/messages"} (indicating that the request handler is /messages and the request method is POST)
    Timestamp: 2020/12/30 15:35:23.123
    Sample value: 217
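The breakdown above can be sketched in Python. This is a minimal illustration, not CLS code: the helper names `parse_sample` and `to_epoch_ms` are hypothetical, the regular expressions cover only the simple `name{label="value", ...} value` form shown here, and the timestamp is assumed to be in UTC because the source does not state a time zone.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches `name{label="value", ...} value`; the label block is optional.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt):
    """Convert an aware datetime to integer milliseconds since the Unix epoch."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def parse_sample(line, timestamp_ms):
    """Split one sample line into metric name, labels, timestamp, and value."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a valid sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return {
        "name": match.group("name"),
        "labels": labels,
        "timestamp_ms": timestamp_ms,         # millisecond precision
        "value": float(match.group("value")),  # sample values are float64
    }


# Assumption: the example timestamp is interpreted as UTC.
ts = to_epoch_ms(datetime(2020, 12, 30, 15, 35, 23, 123000, tzinfo=timezone.utc))
sample = parse_sample('requests_total{method="POST", handler="/messages"} 217', ts)
print(sample)
```

All samples that share the same metric name and the same set of labels belong to one time series; only the timestamp and value change from sample to sample.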
    A monitored system usually exposes many metrics at the same time, each with its own metric name and labels. For example, Nginx monitoring metrics might include the following:
    # HELP nginx_http_requests_total The total number of HTTP requests
    # TYPE nginx_http_requests_total counter
    nginx_http_requests_total 10234
    
    # HELP nginx_http_requests_duration_seconds The HTTP request duration in seconds
    # TYPE nginx_http_requests_duration_seconds histogram
    nginx_http_requests_duration_seconds_bucket{le="0.005"} 2405
    nginx_http_requests_duration_seconds_bucket{le="0.01"} 5643
    nginx_http_requests_duration_seconds_bucket{le="0.025"} 7890
    nginx_http_requests_duration_seconds_bucket{le="0.05"} 9234
    nginx_http_requests_duration_seconds_bucket{le="0.1"} 10021
    nginx_http_requests_duration_seconds_bucket{le="0.25"} 10234
    nginx_http_requests_duration_seconds_bucket{le="0.5"} 10234
    nginx_http_requests_duration_seconds_bucket{le="1"} 10234
    nginx_http_requests_duration_seconds_bucket{le="2.5"} 10234
    nginx_http_requests_duration_seconds_bucket{le="5"} 10234
    nginx_http_requests_duration_seconds_bucket{le="10"} 10234
    nginx_http_requests_duration_seconds_bucket{le="+Inf"} 10234
    nginx_http_requests_duration_seconds_sum 243.56
    nginx_http_requests_duration_seconds_count 10234
    
    # HELP nginx_http_connections Number of HTTP connections
    # TYPE nginx_http_connections gauge
    nginx_http_connections{state="active"} 23
    nginx_http_connections{state="reading"} 5
    nginx_http_connections{state="writing"} 7
    nginx_http_connections{state="waiting"} 11
    
    # HELP nginx_http_response_count_total The total number of HTTP responses sent
    # TYPE nginx_http_response_count_total counter
    nginx_http_response_count_total{status="1xx"} 123
    nginx_http_response_count_total{status="2xx"} 9123
    nginx_http_response_count_total{status="3xx"} 456
    nginx_http_response_count_total{status="4xx"} 567
    nginx_http_response_count_total{status="5xx"} 65
    
    # HELP nginx_up Is the Nginx server up
    # TYPE nginx_up gauge
    nginx_up 1
    The meanings of the metrics are as follows:
    nginx_http_requests_total: The total number of HTTP requests processed by Nginx.
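    nginx_http_requests_duration_seconds: The duration of HTTP requests, exposed as a histogram. Each bucket counts the requests whose duration was less than or equal to its upper bound le (in seconds), so the bucket counts are cumulative. The _sum series is the total duration of all requests, and the _count series is the total number of requests.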
    nginx_http_connections: The current number of HTTP connections in Nginx, categorized into active, reading, writing, and waiting status.
    nginx_http_response_count_total: The total number of HTTP responses returned by Nginx, categorized by status code.
    nginx_up: The operation status of the Nginx server. 1 indicates that it is running; 0 indicates that it is not running.
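To show how the histogram series can be read, the following sketch computes the mean request duration and estimates quantiles from the bucket values in the Nginx example. The function `estimate_quantile` is a hypothetical helper that uses linear interpolation inside a bucket, similar in spirit to PromQL's `histogram_quantile`. It works on the raw cumulative counts for simplicity, whereas a real query would normally apply the function to bucket rates over a time window.

```python
import math

# (upper bound `le` in seconds, cumulative request count), from the example above.
buckets = [
    (0.005, 2405), (0.01, 5643), (0.025, 7890), (0.05, 9234),
    (0.1, 10021), (0.25, 10234), (0.5, 10234), (1, 10234),
    (2.5, 10234), (5, 10234), (10, 10234), (math.inf, 10234),
]
duration_sum = 243.56    # nginx_http_requests_duration_seconds_sum
duration_count = 10234   # nginx_http_requests_duration_seconds_count


def estimate_quantile(q, cumulative_buckets):
    """Estimate the q-quantile (0 <= q <= 1) by interpolating within a bucket."""
    total = cumulative_buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0  # the lowest bucket is assumed to start at 0
    for upper, count in cumulative_buckets:
        if count >= rank:
            if math.isinf(upper):
                # No finite upper bound to interpolate towards.
                return prev_bound
            if count == prev_count:
                return upper
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return prev_bound


mean = duration_sum / duration_count
p50 = estimate_quantile(0.5, buckets)
p99 = estimate_quantile(0.99, buckets)
print(f"mean={mean:.4f}s p50={p50:.4f}s p99={p99:.4f}s")
```

Because the buckets only record counts per upper bound, the quantiles are estimates whose accuracy depends on how finely the bucket boundaries are spaced.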

    Metric Topic

    A metric topic is the fundamental unit for collecting, storing, searching, and analyzing metric data in Cloud Log Service (CLS). Collected metric data is managed within metric topics, which also hold configurations such as the retention period and search and analysis settings. Metric topics are compatible with the Prometheus metrics data model and metric query API, so each one functions similarly to a Prometheus instance. As long as the metric names do not conflict and the data volume does not exceed the product specifications and limits, metrics from different applications or services can be stored in the same metric topic. In practice, metric data from the production, testing, and development environments of a business system is typically stored in separate metric topics.
    Note:
    The metric topics service ended its public beta on July 1, 2024, and is now officially a paid service. For more details, see the Billing Overview.

    Features

    Metric collection:
        Metric reporting: Supports the Prometheus Remote Write protocol, so collectors compatible with this protocol, such as vmagent and Telegraf, can collect and report metrics to a metric topic.
        Log to metric: Logs from a log topic can be converted into metrics using scheduled SQL queries. This approach is suitable for long-term, low-cost storage of key system metrics, and it often provides better performance for visualization analysis based on these metrics.
        Cloud product metric subscription: Supports subscribing to cloud product metrics from Tencent Cloud Observability Platform (TCOP) so that they can be stored and queried centrally in CLS. This enables more flexible statistical analysis of cloud product metrics.
    Metric query: Use PromQL to query metrics.
    Metric visualization: You can use dashboards to visualize metric data in formats such as tables, time series charts, single value charts, and gauges. Additionally, you can use Grafana to display metric data directly.
    Monitoring and alarms: You can configure alarm policies for metric topics, notifying users via SMS, WeChat, phone calls, emails, and WeCom when anomalies in the metrics occur.
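As an illustration of the metric query feature, the sketch below builds a range query URL in the style of the Prometheus HTTP API (`/api/v1/query_range` with `query`, `start`, `end`, and `step` parameters). The base URL is a placeholder, not a real CLS endpoint, and authentication is omitted; consult the CLS documentation for the actual access address and credentials of a metric topic.

```python
from urllib.parse import urlencode

# Assumption: placeholder host, not a real CLS endpoint.
BASE_URL = "https://metrics.example.invalid"


def build_query_range_url(base_url, promql, start, end, step_seconds):
    """Build a Prometheus-style range query URL (the request is not sent)."""
    params = {
        "query": promql,       # PromQL expression
        "start": start,        # Unix timestamp in seconds
        "end": end,            # Unix timestamp in seconds
        "step": step_seconds,  # resolution of the returned series
    }
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"


# Per-second rate of HTTP responses over 5-minute windows, grouped by status class.
promql = "sum by (status) (rate(nginx_http_response_count_total[5m]))"
url = build_query_range_url(BASE_URL, promql, 1609340400, 1609344000, 60)
print(url)
```

The same PromQL expression could be used in a dashboard chart or a Grafana panel that points at the metric topic.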

    Advantages

    Metric topics are compatible with the Prometheus metric data model and query API, enabling seamless integration with various Prometheus-compatible open-source projects, such as Grafana.
    Compared with a self-hosted Prometheus deployment, metric topics require no deployment or maintenance work, which reduces operational labor costs.
    It can be used in combination with logs to centrally collect, store, and analyze metrics and log data, enabling the construction of a unified monitoring platform and improving Ops efficiency.

    Fee Description

    For more details, see the Billing Overview.

    Specifications and Limits

    Restriction item: Description
    Metric name: Supports English letters, numbers, underscores, and colons. It should conform to the regular expression [a-zA-Z_:][a-zA-Z0-9_:]*.
    Label name: Supports English letters, numbers, and underscores. It should conform to the regular expression [a-zA-Z_][a-zA-Z0-9_]*.
    Label value: No special restrictions, supporting all types of Unicode characters.
    Sample value: A float64 type value.
    Sample timestamp: Millisecond precision.
    Query concurrency: A single metric topic supports up to 15 concurrent queries.
    Query data volume: A single query can involve up to 200,000 time series, with a maximum of 11,000 data points per time series in the query results.
    Metric upload frequency control: 25,000 QPS
    Metric upload flow control: 250 MB/s
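The naming rules and the data point limit above can be checked on the client side before data is reported or queried. The following is a minimal sketch with hypothetical helper names. The step calculation assumes that a range query returns floor(range / step) + 1 points per time series, as Prometheus does; CLS may count points differently.

```python
import re

# Regular expressions taken from the limits above.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

MAX_POINTS_PER_SERIES = 11_000


def is_valid_metric_name(name):
    """Return True if the whole string is a valid metric name."""
    return METRIC_NAME_RE.fullmatch(name) is not None


def is_valid_label_name(name):
    """Return True if the whole string is a valid label name."""
    return LABEL_NAME_RE.fullmatch(name) is not None


def min_step_seconds(range_seconds):
    """Smallest whole-second step that keeps one series within the point limit.

    Assumes floor(range / step) + 1 points per series, so the step must be
    strictly greater than range / MAX_POINTS_PER_SERIES.
    """
    return range_seconds // MAX_POINTS_PER_SERIES + 1


print(is_valid_metric_name("nginx_http_requests_total"))
print(is_valid_metric_name("2xx_responses"))
print(is_valid_label_name("status-code"))
print(min_step_seconds(7 * 24 * 3600))
```

A query over a long time range can therefore stay within the limit either by using a larger step or by splitting the range into several shorter queries.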
    