Monitoring Metric List
Last updated: 2023-11-07 18:09:05
The monitoring metric list provides the meanings of all metrics, helping you use the monitoring feature in Stream Compute Service.

Monitoring metric list

Note
You can view the following metrics in TCOP console > Stream Compute Service and configure alarms there.
Metric
Description
Example Value
job_records_in_per_second
The total number of records the job receives from all sources per second.
22478.14 Record/s
job_records_out_per_second
The total number of records the job emits to all sinks per second.
12017.09 Record/s
job_bytes_in_per_second
The total number of bytes the job receives from all sources (Kafka sources only) per second.
786576 Byte/s
job_bytes_out_per_second
The total number of bytes the job emits to all sinks (Kafka sinks only) per second.
156872 Byte/s
job latency
The total time the data takes to flow through all operators. Sampling errors may exist, so the value is for reference only.
275 ms
job_service_delay
The difference between the current timestamp and the watermark at the sink (if there are multiple sinks, the maximum difference is used).
5432 ms
job_cpu_load
The average CPU utilization of all TaskManagers of the job.
23.85%
taskmanager_status_jvm_memory_heap_used_percentage
The average heap memory utilization of all TaskManagers of the job.
57.12%
taskmanager_status_jvm_memory_heap_used
The total heap memory used by all TaskManagers of the job.
830897056.00 Bytes
taskmanager_memory_heap_committed
The total heap memory committed for all TaskManagers of the job.
4937220096.00 Bytes
taskmanager_memory_heap_max
The total maximum heap memory of all TaskManagers of the job.
4937220096.00 Bytes
taskmanager_status_jvm_memory_nonheap_used
The total non-heap memory (JVM metaspace and code cache) used by all TaskManagers of the job.
296651064.00 Bytes
taskmanager_memory_nonheap_committed
The total non-heap memory (JVM metaspace and code cache) committed for all TaskManagers of the job.
103219200.00 Bytes
taskmanager_status_jvm_memory_nonheap_max
The total maximum non-heap memory (JVM metaspace and code cache) of all TaskManagers of the job.
780140544.00 Bytes
taskmanager_status_jvm_memory_process_memoryused
The max JVM memory (RSS) of all TaskManagers of the job, including heap, non-heap, native, and other areas. This metric is used to give an early warning for OOM Killed events in a Pod.
3597035110.00 Bytes
taskmanager_memory_direct_count
The total number of buffers in the direct buffer pools of all TaskManagers of the job.
10993.00 Items
taskmanager_memory_direct_used
The total memory used by the direct buffer pools of all TaskManagers of the job.
360328431.00 Bytes
taskmanager_memory_direct_max
The total maximum size of the direct buffer pools of all TaskManagers of the job.
360328431.00 Bytes
taskmanager_memory_mapped_count
The total number of buffers in the mapped buffer pools of all TaskManagers of the job.
4 Items
taskmanager_memory_mapped_used
The total memory used by the mapped buffer pools of all TaskManagers of the job.
33554432.00 Bytes
taskmanager_memory_mapped_max
The total maximum size of the mapped buffer pools of all TaskManagers of the job.
33554432.00 Bytes
jobmanager_jvm_old_gc_count
The old GC count of the JobManager of the job.
3.00 Times
jobmanager_jvm_old_gc_time
The old GC time of the JobManager of the job.
701.00 ms
jobmanager_jvm_young_gc_count
The young GC count of the JobManager of the job.
53.00 Times
jobmanager_jvm_young_gc_time
The young GC time of the JobManager of the job.
4094.00 ms
job_lastcheckpointduration
The time taken to make the last checkpoint of the job.
723.00 ms
job_lastcheckpointsize
The size of the last checkpoint of the job.
751321.00 Bytes
taskmanager_jvm_old_gc_count
The sum of old GC counts of all TaskManagers of the job.
9.00 Times
taskmanager_jvm_old_gc_time
The sum of old GC time of all TaskManagers of the job.
2014.00 ms
taskmanager_jvm_young_gc_count
The sum of young GC counts of all TaskManagers of the job.
889.00 Times
taskmanager_jvm_young_gc_time
The sum of young GC time of all TaskManagers of the job.
15051.00 ms
job_numberofcompletedcheckpoints
The number of successful checkpoints of the job.
11.00 Times
job_numberoffailedcheckpoints
The number of failed checkpoints of the job.
1.00 Time
job_numberofinprogresscheckpoints
The number of checkpoints in progress (not completed) of the job.
1.00 Time
job_totalnumberofcheckpoints
The total number of checkpoints (in progress, completed, and failed) of the job.
13.00 Times
job_numrecordsinbutfailed
The number of records that failed in the operator (for example, because an exception was raised). If the value is greater than 1, Exactly-Once semantics will be affected. This is a testing parameter and is for reference only.
0.00 Times
jobmanager_job_numrestarts
The number of job restarts caused by crashes, as recorded by the JobManager of the job. Restarts of the job after the JobManager itself exits are not counted.
10.00 Times
jobmanager_status_jvm_memory_heap_used_percentage
The heap memory utilization of the JobManager of the job.
31.34%
jobmanager_memory_heap_used
The heap memory used by the JobManager of the job.
1040001560.00 Bytes
jobmanager_memory_heap_committed
The heap memory committed for the JobManager of the job.
3318218752.00 Bytes
jobmanager_memory_heap_max
The maximum heap memory of the JobManager of the job.
3318218752.00 Bytes
jobmanager_status_jvm_memory_nonheap_used
The non-heap memory (JVM metaspace and code cache) used by the JobManager of the job.
117362656.00 Bytes
jobmanager_memory_nonheap_committed
The non-heap memory (JVM metaspace and code cache) committed for the JobManager of the job.
122183680.00 Bytes
jobmanager_status_jvm_memory_nonheap_max
The maximum non-heap memory (JVM metaspace and code cache) of the JobManager of the job.
780140544.00 Bytes
jobmanager_status_jvm_memory_used
The JVM memory used (RSS) of the JobManager of the job, including heap, non-heap, native and other areas. This metric is used to give an early warning for OOM Killed events in a Pod.
3597035110.00 Bytes
jobmanager_cpu_load
The CPU utilization of the JobManager of the job.
7.12%
jobmanager_cpu_time
The CPU service time (ms) of the JobManager of the job.
834490.00 ms
jobmanager_downtime
For a non-running (failed or recovering) job, the duration of this downtime; for a running job, the value of this metric is 0.
1088466.00 ms
job_uptime
For a running job, the duration of continuous running of this job without interruption.
202305.00 ms
job_restartingtime
The time taken for the last restart of the job.
197181.00 ms
jobmanager_lastcheckpointrestoretimestamp
The Unix timestamp (in ms) of the last job recovery from a checkpoint. The value is -1 if no recovery has been performed.
1621934344137.00 ms
jobmanager_memory_mapped_count
The number of buffers in the mapped buffer pool of the JobManager of the job.
4.00 Items
jobmanager_memory_mapped_memoryused
The memory used by the mapped buffer pool of the JobManager of the job.
33554432.00 Bytes
jobmanager_memory_mapped_totalcapacity
The maximum size of the mapped buffer pool of the JobManager of the job.
33554432.00 Bytes
jobmanager_memory_direct_count
The number of buffers in the direct buffer pool of the JobManager of the job.
22.00 Items
jobmanager_memory_direct_memoryused
The memory used by the direct buffer pool of the JobManager of the job.
575767.00 Bytes
jobmanager_memory_direct_totalcapacity
The maximum size of the direct buffer pool of the JobManager of the job.
577814.00 Bytes
jobmanager_numregisteredtaskmanagers
The number of registered TaskManagers of the job, which is generally equal to the max operator parallelism. A decline in this number indicates that some TaskManagers have disconnected, and the job may crash and try to recover.
3.00 TaskManagers
jobmanager_numrunningjobs
The number of running jobs. A value of 1 means the job is running properly; 0 means the job has crashed.
1.00 Job
jobmanager_taskslotsavailable
The number of task slots available. A value of 0 means the job is running properly; a value other than 0 means the job may not have been running for a short period of time.
0.00 Slots
jobmanager_taskslotstotal
The total number of task slots. In Stream Compute Service, a TaskManager has only one task slot, so this value is equal to the number of registered TaskManagers.
3.00 Slots
jobmanager_threads_count
The number of active threads in the JobManager of the job, including daemon and non-daemon threads.
77.00 Threads
taskmanager_cpu_time
The CPU service time (ms) of all TaskManagers of the job.
2029230.00 ms
taskmanager_network_availablememorysegments
The sum of memory segments available in all TaskManagers of the job.
32890.00 Items
taskmanager_network_totalmemorysegments
The sum of total memory segments assigned to all TaskManagers of the job.
32931.00 Items
taskmanager_threads_count
The total number of active threads in all TaskManagers of the job, including daemon and non-daemon threads.
207.00 Threads
job_lastcheckpointsize
The size of the last checkpoint.
1,024 Bytes
job_lastcheckpointduration
The time taken to make the last checkpoint.
100 ms
job_numberoffailedcheckpoints
The number of failed checkpoints.
50 Times
JM CPU Load
The JVM CPU utilization of the JobManager.
12%
JM Heap Memory
The heap memory usage of the JobManager.
50 Bytes
JM GC Count
Status.JVM.GarbageCollector.<GarbageCollector>.Count of the JobManager, representing the GC count of the JobManager.
5 Times
JM GC Time
Status.JVM.GarbageCollector.<GarbageCollector>.Time of the JobManager, representing the GC time of the JobManager.
64 ms
TaskManager CPU Load
The JVM CPU utilization of the selected TaskManager.
70%
TaskManager Heap Memory
The heap memory usage of the selected TaskManager.
50 Bytes
TaskManager GC Count
Status.JVM.GarbageCollector.<GarbageCollector>.Count of the selected TaskManager, representing the GC count of the TaskManager.
5 Times
TaskManager GC Time
Status.JVM.GarbageCollector.<GarbageCollector>.Time of the selected TaskManager, representing the GC time of the TaskManager.
5 ms
Task OutPoolUsage
The usage percentage of the task's output buffer pool. When this metric reaches 100%, the task is backpressured.
64%
Task OutputQueueLength
The number of buffers queued in the task's output queue.
6
Task InPoolUsage
The usage percentage of the task's input buffer pool. When this metric reaches 100%, the task is backpressured.
64%
Task InputQueueLength
The number of buffers queued in the task's input queue.
6
Task CurrentInputWatermark
The current watermark of the task.
1623814418
Data import time (ETL)
The delay with which a source in the job ingests data.
10 ms
job_records_in_per_second (ETL)
The total input rate of all sources in the job.
342 Records/s
SourceIdleTime (ETL)
The interval between data batches processed by a source in the job, which indirectly reflects the idle time of the source.
24532223 ms
SynDelay (ETL)
The time a source in the job takes to ingest and process data.
1345 ms
BinLogPos (ETL)
The MySQL binary log coordinates or PostgreSQL log sequence number (LSN) of the job.
260690147
job latency (ETL)
The average delay between the sink and source operators of the job.
49 ms
DbFlushDelay (ETL)
The sum of the database flush delay and async callback time of the job.
30 ms
job_records_out_per_second (ETL)
The total output rate of all sinks in the job.
234 Records/s
Source - full sync (ETL)
The full data sync progress of the job.
30%
Source - incremental sync (ETL)
The incremental sync delay of the job. For MySQL, it is the gap between the binlog coordinates of the current source and the latest binlog coordinates of the MySQL source instance collected in the last sampling. For PostgreSQL, it is the gap between the LSN of the current source and the latest LSN of the PostgreSQL source instance collected in the last sampling.
205
Kafka - records_lag max
The maximum of kafka-lag-max (the difference between the Kafka producer and consumer offsets) reported by the TaskManager.
100
Kafka - records_lag min
The minimum of kafka-lag-max (the difference between the Kafka producer and consumer offsets) reported by the TaskManager.
50
Kafka - records_lag mean
The mean of kafka-lag-max (the difference between the Kafka producer and consumer offsets) reported by the TaskManager.
80
Kafka - records_lag sum
The sum of kafka-lag-max (the difference between the Kafka producer and consumer offsets) reported by the TaskManager.
500
CurrentFetchEventtimeLag (ms)
Formula: FetchTime (the time the source fetches the data) − EventTime (the event time of the data). This metric reflects how long data is retained in the external system.
10
CurrentEmitEventtimeLag (ms)
Formula: EmitTime (the time the data leaves the source) − EventTime (the event time of the data). This metric reflects how long data is retained between the external system and the source.
20
taskmanager_job_task_backpressuredtimemspersecond (%)
The maximum of all subtask backpressure percentages in the job.
30%
taskmanager_job_task_dataskewcoefficient
The coefficient of variation (standard deviation / mean) of the subtask inputs of the job. A value below 10% indicates weak skew.
10%
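
The coefficient of variation behind taskmanager_job_task_dataskewcoefficient can be illustrated with a short sketch. The function name and the per-subtask record counts below are hypothetical, and using the population standard deviation is an assumption; the service may compute the metric differently.

```python
from statistics import mean, pstdev


def data_skew_coefficient(subtask_inputs):
    """Return standard deviation / mean of per-subtask input counts."""
    if not subtask_inputs:
        raise ValueError("at least one subtask input count is required")
    avg = mean(subtask_inputs)
    if avg == 0:
        return 0.0  # no input at all, so there is nothing to be skewed
    # Assumption: population standard deviation over all subtasks.
    return pstdev(subtask_inputs) / avg


# Hypothetical record counts received by four subtasks.
balanced = [1000, 1000, 1000, 1000]
skewed = [400, 400, 400, 2800]

print(f"balanced: {data_skew_coefficient(balanced):.0%}")
print(f"skewed:   {data_skew_coefficient(skewed):.0%}")
```

In this sketch the balanced input yields 0%, while the input where one subtask receives most of the records yields a value far above the 10% threshold mentioned in the list.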

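The timestamp-based definitions in the list (FetchTime − EventTime, EmitTime − EventTime, and the job_service_delay difference between the current timestamp and the sink watermark) reduce to subtractions on millisecond timestamps. The sketch below only restates those definitions; the function names and timestamps are hypothetical and are not part of any Stream Compute Service API.

```python
def fetch_event_time_lag(fetch_time_ms, event_time_ms):
    """FetchTime - EventTime, in milliseconds."""
    return fetch_time_ms - event_time_ms


def emit_event_time_lag(emit_time_ms, event_time_ms):
    """EmitTime - EventTime, in milliseconds."""
    return emit_time_ms - event_time_ms


def service_delay(now_ms, sink_watermarks_ms):
    """Current timestamp minus sink watermark; the maximum across sinks is used."""
    return max(now_ms - watermark for watermark in sink_watermarks_ms)


# Hypothetical timestamps (Unix time in ms).
event_time = 1_621_934_344_000
fetch_time = event_time + 10  # fetched 10 ms after the event occurred
emit_time = event_time + 20   # left the source 20 ms after the event occurred

print(fetch_event_time_lag(fetch_time, event_time))  # 10
print(emit_event_time_lag(emit_time, event_time))    # 20

# Two sinks whose watermarks trail the current time by 5000 ms and 5432 ms.
now = emit_time + 5_000
print(service_delay(now, [emit_time, emit_time - 432]))  # 5432
```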