Kudu Monitoring Metrics
Last updated: 2023-12-27 14:51:37

Kudu - overview

| Title | Metric | Unit | Description |
|---|---|---|---|
| Tablets | TabletRunning | - | Total number of tablets currently running on all tablet servers |
| Difference in the number of tablet replicas | ClusterReplicaSkew | - | Difference between the number of replicas on the tablet server hosting the most replicas and the number of replicas on the tablet server hosting the fewest replicas |
| TServer threads | ThreadsRunning | - | Number of threads currently running on all tablet servers |
| Master threads | ThreadsRunning | - | Number of threads currently running on all masters |
| TServer logs | ErrorMessages | - | Number of ERROR-level log messages emitted in all processes |
| Master logs | ErrorMessages | - | Number of ERROR-level log messages emitted in all processes |
| Master logs | WarningMessages | - | Number of WARNING-level log messages emitted in all processes |
| Oversized write requests | OversizedWriteRequests | - | Number of oversized write requests to the system catalog tablet rejected by the master since start |
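
The ClusterReplicaSkew description above amounts to "largest replica count minus smallest replica count". A minimal sketch of that arithmetic follows; the function name `replica_skew` and the server names in the sample data are hypothetical and are not part of any Kudu or Tencent Cloud API.

```python
def replica_skew(replicas_per_tserver: dict[str, int]) -> int:
    """Return the replica skew: max minus min replica count across tablet servers.

    `replicas_per_tserver` maps a tablet server name to the number of
    replicas it hosts. An empty cluster is treated as having zero skew.
    """
    if not replicas_per_tserver:
        return 0
    counts = replicas_per_tserver.values()
    return max(counts) - min(counts)


# Hypothetical sample data: three tablet servers and their replica counts.
sample = {"tserver-1": 120, "tserver-2": 95, "tserver-3": 104}
print(replica_skew(sample))  # 120 - 95 = 25
```

A skew near zero suggests replicas are spread evenly; a large value suggests that one tablet server hosts many more replicas than another.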

Kudu - server

| Title | Metric | Unit | Description |
|---|---|---|---|
| Block cache hit | BlockCacheHit | - | Number of block cache hits. To assess cache efficiency, use this metric rather than cache_hits |
| Block cache hit | BlockCacheMiss | - | Number of block cache misses. To assess cache efficiency, use this metric rather than cache_misses |
| Block cache utilization | BlockCacheUsage | bytes | Memory used by the block cache |
| File cache hit | FileCacheHit | - | Number of file descriptor cache hits. To assess cache efficiency, use this metric rather than cache_hits |
| File cache hit | FileCacheMiss | - | Number of file descriptor cache misses. To assess cache efficiency, use this metric rather than cache_misses |
| File cache utilization | FileCacheUsage | - | Number of entries in the file cache |
| Scanner | ActiveScanners | - | Number of currently active scanners |
| Scanner | ExpiredScanners | - | Number of scanners that have expired due to inactivity since service start |
| Block manager blocks | BlockUnderManagement | - | Number of data blocks currently under management |
| Block manager blocks | BlockOpenReading | - | Number of data blocks currently open for reading |
| Block manager blocks | BlockOpenWriting | - | Number of data blocks currently open for writing |
| Block manager bytes | BytesUnderManagement | bytes | Number of bytes of data blocks currently under management |
| Block manager containers | ContainersUnderManagement | - | Number of log block containers |
| Block manager containers | FullContainersUnderManagement | - | Number of full log block containers |
| Tablet leaders | NumRaftLeaders | - | Number of tablet replicas that are Raft leaders |
| Tablet sessions | OpenClientSessions | - | Number of currently open tablet copy client sessions on this server |
| Tablet sessions | OpemSourceSessions | - | Number of currently open tablet copy source sessions on this server |
| Tablets | TabletBootstrapping | - | Number of currently bootstrapping tablets |
| Tablets | TabletFailed | - | Number of failed tablets |
| Tablets | TabletInitialized | - | Number of currently initialized tablets |
| Tablets | TabletNotInitialized | - | Number of currently uninitialized tablets |
| Tablets | TabletRunning | - | Number of currently running tablets |
| Tablets | TabletShutdown | - | Number of currently shut down tablets |
| Tablets | TabletStopped | - | Number of currently stopped tablets |
| Tablets | TabletStopping | - | Number of currently stopping tablets |
| CPU time | CpuStime | ms | Total system CPU time of the process |
| CPU time | CpuUtime | ms | Total user CPU time of the process |
| Data path | DataDirsFailed | - | Number of data directories whose disks are currently in failed status |
| Data path | DataDirsFull | - | Number of data directories whose disks are currently full |
| Thread | ThreadsRunning | - | Number of currently running threads |
| Context | InvoluntarySwitches | - | Total involuntary context switches |
| Context | VoluntarySwitches | - | Total voluntary context switches |
| Spinlock | SpinlockContentionTime | μs | Amount of time consumed by contention on internal spinlocks since server start |
| Log information | ErrorMessages | - | Number of ERROR-level log messages emitted by the application |
| Log information | WarningMessages | - | Number of WARNING-level log messages emitted by the application |
| Operations in queue | TotalCount | - | Total number of samples recorded |
| Operations in queue | Min | - | Minimum number of tasks waiting in the queue |
| Operations in queue | Max | - | Maximum number of tasks waiting in the queue |
| Operations in queue | Mean | - | Average number of tasks waiting in the queue |
| Operations in queue | Percentile_99_9 | - | 99.9th percentile of the number of tasks waiting in the queue |
| Operation execution duration | TotalCount | - | Total number of operations |
| Operation execution duration | Min | μs | Minimum run time |
| Operation execution duration | Max | μs | Maximum run time |
| Operation execution duration | Mean | μs | Average run time |
| Operation execution duration | Percentile_99_9 | μs | 99.9th percentile of the run time |
| Queuing wait time | TotalCount | - | Total number of operations |
| Queuing wait time | Min | μs | Minimum wait time |
| Queuing wait time | Max | μs | Maximum wait time |
| Queuing wait time | Mean | μs | Average wait time |
| Queuing wait time | Percentile_99_9 | μs | 99.9th percentile of the wait time |
| Allocated bytes | AllocatedBytes | bytes | Number of bytes used by the application. This usually does not match the memory usage reported by the operating system because it does not include TCMalloc overhead or memory fragmentation |
| Hybrid clock error | HybridClockError | μs | Maximum error of the server clock; returns 2^64-1 when the base clock cannot be read |
| Hybrid clock timestamp | HybridClockTimestamp | μs | Hybrid clock timestamp; returns 2^64-1 when the base clock cannot be read |
| TCMalloc memory | HeapSize | bytes | Bytes of system memory reserved by TCMalloc |
| TCMalloc memory | CurrentThreadCacheBytes | bytes | A measure of some of the memory TCMalloc is using (for small objects) |
| TCMalloc memory | TotalThreadCacheBytes | bytes | A limit on how much memory TCMalloc dedicates to small objects |
| TCMalloc page heap | FreeBytes | bytes | Number of bytes of free mapped pages in the page heap |
| TCMalloc page heap | UnMappedBytes | bytes | Number of bytes of free unmapped pages in the page heap |
| RPC request | ConnectionsAccepted | - | Number of incoming TCP connections made to the RPC server |
| RPC request | QueueOverflow | - | Number of RPCs dropped because the service queue was full |
| RPC request | TimesOutInQueue | - | Number of RPCs that timed out while waiting in the service queue and thus were not processed |
| RPC FetchData | TotalCount | - | Total number of operations |
| RPC FetchData | Min | μs | Minimum processing time |
| RPC FetchData | Max | μs | Maximum processing time |
| RPC FetchData | Mean | μs | Average processing time |
| RPC FetchData | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC AlterSchema | TotalCount | - | Total number of operations |
| RPC AlterSchema | Min | μs | Minimum processing time |
| RPC AlterSchema | Max | μs | Maximum processing time |
| RPC AlterSchema | Mean | μs | Average processing time |
| RPC AlterSchema | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC CreateTablet | TotalCount | - | Total number of operations |
| RPC CreateTablet | Min | μs | Minimum processing time |
| RPC CreateTablet | Max | μs | Maximum processing time |
| RPC CreateTablet | Mean | μs | Average processing time |
| RPC CreateTablet | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC DeleteTablet | TotalCount | - | Total number of operations |
| RPC DeleteTablet | Min | μs | Minimum processing time |
| RPC DeleteTablet | Max | μs | Maximum processing time |
| RPC DeleteTablet | Mean | μs | Average processing time |
| RPC DeleteTablet | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC Quiesce | TotalCount | - | Total number of operations |
| RPC Quiesce | Min | μs | Minimum processing time |
| RPC Quiesce | Max | μs | Maximum processing time |
| RPC Quiesce | Mean | μs | Average processing time |
| RPC Quiesce | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC Scan | TotalCount | - | Total number of operations |
| RPC Scan | Min | μs | Minimum processing time |
| RPC Scan | Max | μs | Maximum processing time |
| RPC Scan | Mean | μs | Average processing time |
| RPC Scan | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC ScannerKeepAlive | TotalCount | - | Total number of operations |
| RPC ScannerKeepAlive | Min | μs | Minimum processing time |
| RPC ScannerKeepAlive | Max | μs | Maximum processing time |
| RPC ScannerKeepAlive | Mean | μs | Average processing time |
| RPC ScannerKeepAlive | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC Write | TotalCount | - | Total number of operations |
| RPC Write | Min | μs | Minimum processing time |
| RPC Write | Max | μs | Maximum processing time |
| RPC Write | Mean | μs | Average processing time |
| RPC Write | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| Write requests rejected due to queue overloading | QueueOverloadRejections | count | Number of write requests rejected due to queue overloading |
| Scan rate | ScannedFromDiskRate | bytes/s | Amount of data scanned from disk per second |
| Scan rate | ScannerReturnedRate | bytes/s | Amount of data returned per second |
| Scanner bytes | ScannedFromDisk | bytes | Total amount of data scanned from disk |
| Scanner bytes | ScannerReturned | bytes | Total amount of returned data |
| Total row operations | RowsInserted | count | Number of rows inserted into the node |
| Total row operations | RowsDeleted | count | Number of rows deleted from the node |
| Total row operations | RowsUpserted | count | Number of rows upserted into the node |
| Total row operations | RowsUpdated | count | Number of rows updated on the node |
| Row operation rate | RowsInsertedRate | count/s | Number of rows inserted into the node per second |
| Row operation rate | RowsDeletedRate | count/s | Number of rows deleted from the node per second |
| Row operation rate | RowsUpsertedRate | count/s | Number of rows upserted into the node per second |
| Row operation rate | RowsUpdatedRate | count/s | Number of rows updated on the node per second |
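
The BlockCacheHit and BlockCacheMiss descriptions above point to these metrics as the ones to use for judging cache efficiency. One common way to do that is a hit ratio over a time window. The sketch below assumes both metrics are cumulative counters sampled at two points in time; that assumption, and the function name `cache_hit_ratio`, are illustrative and not taken from this page.

```python
from typing import Optional


def cache_hit_ratio(hits_before: int, misses_before: int,
                    hits_after: int, misses_after: int) -> Optional[float]:
    """Return the hit ratio for the window between two counter samples.

    Assumes the inputs are cumulative counters (an assumption, see lead-in).
    Returns None when there were no lookups in the window, or when a counter
    went backwards, which would suggest a process restart between samples.
    """
    hits = hits_after - hits_before
    misses = misses_after - misses_before
    total = hits + misses
    if hits < 0 or misses < 0 or total == 0:
        return None
    return hits / total


# Hypothetical samples: 900 new hits and 100 new misses in the window.
print(cache_hit_ratio(1000, 200, 1900, 300))  # 0.9
```

The same calculation could be applied to FileCacheHit and FileCacheMiss. If the monitoring system already reports per-interval values rather than cumulative counters, pass zeros for the "before" arguments.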

Kudu - master

| Title | Metric | Unit | Description |
|---|---|---|---|
| Block cache hit | BlockCacheHit | - | Number of block cache hits. To assess cache efficiency, use this metric rather than cache_hits |
| Block cache hit | BlockCacheMiss | - | Number of block cache misses. To assess cache efficiency, use this metric rather than cache_misses |
| Block cache utilization | BlockCacheUsage | bytes | Memory used by the block cache |
| File cache hit | FileCacheHit | - | Number of file descriptor cache hits. To assess cache efficiency, use this metric rather than cache_hits |
| File cache hit | FileCacheMiss | - | Number of file descriptor cache misses. To assess cache efficiency, use this metric rather than cache_misses |
| File cache utilization | FileCacheUsage | - | Number of entries in the file cache |
| Block manager blocks | BlockUnderManagement | - | Number of data blocks currently under management |
| Block manager blocks | BlockOpenReading | - | Number of data blocks currently open for reading |
| Block manager blocks | BlockOpenWriting | - | Number of data blocks currently open for writing |
| Block manager bytes | BytesUnderManagement | bytes | Number of bytes of data blocks currently under management |
| Block manager containers | ContainersUnderManagement | - | Number of log block containers |
| Block manager containers | FullContainersUnderManagement | - | Number of full log block containers |
| CPU time | CpuStime | ms | Total system CPU time of the process |
| CPU time | CpuUtime | ms | Total user CPU time of the process |
| Thread | ThreadsRunning | - | Number of currently running threads |
| Data path | DataDirsFailed | - | Number of data directories whose disks are currently in failed status |
| Data path | DataDirsFull | - | Number of data directories whose disks are currently full |
| Allocated bytes | AllocatedBytes | bytes | Number of bytes used by the application. This usually does not match the memory usage reported by the operating system because it does not include TCMalloc overhead or memory fragmentation |
| Log information | ErrorMessages | - | Number of ERROR-level log messages emitted by the application |
| Log information | WarningMessages | - | Number of WARNING-level log messages emitted by the application |
| Context | InvoluntarySwitches | - | Total involuntary context switches |
| Context | VoluntarySwitches | - | Total voluntary context switches |
| Operations in queue | TotalCount | - | Total number of samples recorded |
| Operations in queue | Min | - | Minimum number of tasks waiting in the queue |
| Operations in queue | Max | - | Maximum number of tasks waiting in the queue |
| Operations in queue | Mean | - | Average number of tasks waiting in the queue |
| Operations in queue | Percentile_99_9 | - | 99.9th percentile of the number of tasks waiting in the queue |
| Queuing wait time | TotalCount | - | Total number of operations |
| Queuing wait time | Min | μs | Minimum wait time |
| Queuing wait time | Max | μs | Maximum wait time |
| Queuing wait time | Mean | μs | Average wait time |
| Queuing wait time | Percentile_99_9 | μs | 99.9th percentile of the wait time |
| Operation execution duration | TotalCount | - | Total number of operations |
| Operation execution duration | Min | μs | Minimum run time |
| Operation execution duration | Max | μs | Maximum run time |
| Operation execution duration | Mean | μs | Average run time |
| Operation execution duration | Percentile_99_9 | μs | 99.9th percentile of the run time |
| Spinlock | SpinlockContentionTime | μs | Amount of time consumed by contention on internal spinlocks since server start |
| Oversized write requests | OversizedWriteRequests | - | Number of oversized write requests to the system catalog tablet rejected since start |
| Hybrid clock error | HybridClockError | μs | Maximum error of the server clock; returns 2^64-1 when the base clock cannot be read |
| Hybrid clock timestamp | HybridClockTimestamp | μs | Hybrid clock timestamp; returns 2^64-1 when the base clock cannot be read |
| Difference in the number of tablet replicas | ClusterReplicaSkew | - | Difference between the number of replicas on the tablet server hosting the most replicas and the number of replicas on the tablet server hosting the fewest replicas |
| Tablet leaders | NumRaftLeaders | - | Number of tablet replicas that are Raft leaders |
| Tablet sessions | OpemSourceSessions | - | Number of currently open tablet copy source sessions on this server |
| TCMalloc memory | HeapSize | bytes | Bytes of system memory reserved by TCMalloc |
| TCMalloc memory | CurrentThreadCacheBytes | bytes | A measure of some of the memory TCMalloc is using (for small objects) |
| TCMalloc memory | TotalThreadCacheBytes | bytes | A limit on how much memory TCMalloc dedicates to small objects |
| TCMalloc page heap | FreeBytes | bytes | Number of bytes of free mapped pages in the page heap |
| TCMalloc page heap | UnMappedBytes | bytes | Number of bytes of free unmapped pages in the page heap |
| RPC request | ConnectionsAccepted | - | Number of incoming TCP connections made to the RPC server |
| RPC request | QueueOverflow | - | Number of RPCs dropped because the service queue was full |
| RPC request | TimesOutInQueue | - | Number of RPCs that timed out while waiting in the service queue and thus were not processed |
| RPC RunLeaderElection | TotalCount | - | Total number of operations |
| RPC RunLeaderElection | Min | μs | Minimum processing time |
| RPC RunLeaderElection | Max | μs | Maximum processing time |
| RPC RunLeaderElection | Mean | μs | Average processing time |
| RPC RunLeaderElection | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC ConnectToMaster | TotalCount | - | Total number of operations |
| RPC ConnectToMaster | Min | μs | Minimum processing time |
| RPC ConnectToMaster | Max | μs | Maximum processing time |
| RPC ConnectToMaster | Mean | μs | Average processing time |
| RPC ConnectToMaster | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC Ping | TotalCount | - | Total number of operations |
| RPC Ping | Min | μs | Minimum processing time |
| RPC Ping | Max | μs | Maximum processing time |
| RPC Ping | Mean | μs | Average processing time |
| RPC Ping | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC TSHeartbeat | TotalCount | - | Total number of operations |
| RPC TSHeartbeat | Min | μs | Minimum processing time |
| RPC TSHeartbeat | Max | μs | Maximum processing time |
| RPC TSHeartbeat | Mean | μs | Average processing time |
| RPC TSHeartbeat | Percentile_99_9 | μs | 99.9th percentile of the processing time |
| RPC FetchData | TotalCount | - | Total number of operations |
| RPC FetchData | Min | μs | Minimum processing time |
| RPC FetchData | Max | μs | Maximum processing time |
| RPC FetchData | Mean | μs | Average processing time |
| RPC FetchData | Percentile_99_9 | μs | 99.9th percentile of the processing time |
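
HybridClockError and HybridClockTimestamp report 2^64-1 when the base clock cannot be read, so a consumer of these metrics may want to treat that value as "unavailable" rather than as a real measurement. The following is a minimal sketch of such a check. The helper name `clock_error_us` is hypothetical, and the `>=` comparison is a defensive assumption for pipelines that store metric values as floating-point numbers, where 2^64-1 rounds up to 2^64.

```python
from typing import Optional, Union

# Sentinel described in the table: returned when the base clock is unreadable.
CLOCK_UNREADABLE = 2**64 - 1


def clock_error_us(raw_value: Union[int, float]) -> Optional[Union[int, float]]:
    """Return the clock error in microseconds, or None for the sentinel value.

    Uses >= so that a sentinel stored as a float (which rounds 2**64 - 1
    up to 2**64) is still recognized.
    """
    if raw_value >= CLOCK_UNREADABLE:
        return None
    return raw_value


print(clock_error_us(350))        # 350
print(clock_error_us(2**64 - 1))  # None
```

Filtering the sentinel before averaging or plotting keeps a single unreadable sample from dwarfing every real clock error value.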
