HDFS Monitoring Metrics
Last updated: 2023-12-27 14:48:18

HDFS - Overview

| Title | Metric | Unit | Description |
|---|---|---|---|
| Cluster storage capacity | CapacityTotal | GB | Total cluster storage capacity |
| | CapacityUsed | GB | Used cluster storage capacity |
| | CapacityRemaining | GB | Remaining cluster storage capacity |
| | CapacityUsedNonDFS | GB | Cluster storage capacity used by non-HDFS data |
| Cluster load | TotalLoad | - | Current number of connections |
| Total files in cluster | FilesTotal | - | Total number of files |
| Blocks | BlocksTotal | - | Total number of blocks |
| | PendingReplicationBlocks | - | Number of blocks pending replication |
| | UnderReplicatedBlocks | - | Number of blocks with insufficient replicas |
| | CorruptBlocks | - | Number of corrupted blocks |
| | ScheduledReplicationBlocks | - | Number of blocks scheduled for replication |
| | PendingDeletionBlocks | - | Number of blocks waiting to be deleted |
| | ExcessBlocks | - | Number of excess blocks |
| | PostponedMisreplicatedBlocks | - | Number of mis-replicated blocks whose processing is postponed |
| Block capacity | BlockCapacity | - | Block capacity |
| Cluster data node | NumLiveDataNodes | - | Number of live data nodes |
| | NumDeadDataNodes | - | Number of data nodes marked as dead |
| | NumDecomLiveDataNodes | - | Number of decommissioned live nodes |
| | NumDecomDeadDataNodes | - | Number of decommissioned dead nodes |
| | NumDecommissioningDataNodes | - | Number of decommissioning nodes |
| | NumStaleDataNodes | - | Number of DataNodes marked as stale |
| HDFS storage space utilization | CapacityUsedRate | - | HDFS cluster storage space utilization |
| Snapshots | Snapshots | - | Number of snapshots |
| Disk failure | VolumeFailuresTotal | - | Total number of volume failures across all DataNodes |
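
As a rough illustration of how the overview metrics can be read together, the sketch below derives a utilization ratio from `CapacityUsed` and `CapacityTotal` and lists the fault-related counters that are nonzero. The function name `summarize_overview`, the sample values, and the assumption that utilization equals used capacity divided by total capacity are hypothetical; this page does not state how `CapacityUsedRate` is calculated.

```python
# Counters from the overview table that usually deserve attention when nonzero.
FAULT_COUNTERS = (
    "CorruptBlocks",
    "UnderReplicatedBlocks",
    "NumDeadDataNodes",
    "VolumeFailuresTotal",
)


def summarize_overview(metrics):
    """Return (utilization, nonzero_fault_counters) for a dict of overview metrics."""
    total = metrics["CapacityTotal"]
    used = metrics["CapacityUsed"]
    # Assumption: utilization is simply used / total (both in GB).
    utilization = used / total if total else 0.0
    nonzero = [name for name in FAULT_COUNTERS if metrics.get(name, 0) > 0]
    return utilization, nonzero


# Hypothetical sample values, not taken from a real cluster.
sample = {
    "CapacityTotal": 1000.0,
    "CapacityUsed": 250.0,
    "CorruptBlocks": 0,
    "UnderReplicatedBlocks": 3,
    "NumDeadDataNodes": 1,
    "VolumeFailuresTotal": 0,
}
utilization, nonzero = summarize_overview(sample)
print(f"utilization={utilization:.1%} nonzero={nonzero}")
```

A check like this is only a starting point; suitable thresholds depend on the cluster.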

HDFS - NameNode

| Title | Metric | Unit | Description |
|---|---|---|---|
| Data traffic | ReceivedBytes | Bytes/s | Data receiving rate |
| | SentBytes | Bytes/s | Data sending rate |
| QPS | RpcQueueTimeNumOps | count/s | RPC call rate |
| Request processing latency | RpcQueueTimeAvgTime | ms | Average RPC queue latency |
| | RpcProcessingTimeAvgTime | ms | Average RPC request processing time |
| Authentication and authorization | RpcAuthenticationFailures | count/s | Number of RPC authentication failures |
| | RpcAuthenticationSuccesses | count/s | Number of RPC authentication successes |
| | RpcAuthorizationFailures | count/s | Number of RPC authorization failures |
| | RpcAuthorizationSuccesses | count/s | Number of RPC authorization successes |
| Current connections | NumOpenConnections | - | Number of current connections |
| Length of RPC processing queue | CallQueueLength | - | Length of current RPC processing queue |
| JVM memory | MemNonHeapUsedM | MB | Size of non-heap memory currently used by JVM |
| | MemNonHeapCommittedM | MB | Size of non-heap memory committed by JVM |
| | MemHeapUsedM | MB | Size of heap memory currently used by JVM |
| | MemHeapCommittedM | MB | Size of heap memory committed by JVM |
| | MemHeapMaxM | MB | Maximum size of heap memory configured for JVM |
| | MemMaxM | MB | Maximum size of memory available to JVM runtime |
| Block reporting latency | BlockReportAvgTime | ms | Average latency of processing DataNode block reports |
| JVM threads | ThreadsNew | - | Number of threads in NEW status |
| | ThreadsRunnable | - | Number of threads in RUNNABLE status |
| | ThreadsBlocked | - | Number of threads in BLOCKED status |
| | ThreadsWaiting | - | Number of threads in WAITING status |
| | ThreadsTimedWaiting | - | Number of threads in TIMED_WAITING status |
| | ThreadsTerminated | - | Number of threads in TERMINATED status |
| JVM logs | LogFatal | - | Number of FATAL-level logs |
| | LogError | - | Number of ERROR-level logs |
| | LogWarn | - | Number of WARN-level logs |
| | LogInfo | - | Number of INFO-level logs |
| GC count | YGC | - | Young GC count |
| | FGC | - | Full GC count |
| GC time | FGCT | s | Full GC time |
| | GCT | s | Garbage collection time |
| | YGCT | s | Young GC time |
| Memory zone proportion | S0 | % | Percentage of used Survivor 0 memory |
| | S1 | % | Percentage of used Survivor 1 memory |
| | E | % | Percentage of used Eden memory |
| | O | % | Percentage of used Old memory |
| | M | % | Percentage of used Metaspace memory |
| | CCS | % | Percentage of used compressed class space memory |
| Storages marked as content stale | NumStaleStorages | - | Number of DataNode storages marked as content stale |
| Pending block-related messages for later processing on the standby NameNode | PendingDataNodeMessageCount | count/s | Number of DataNode requests queued on the standby NameNode |
| Missing blocks | NumberOfMissingBlocks | - | Number of missing blocks |
| | NumberOfMissingBlocksWithReplicationFactorOne | - | Number of missing blocks with a replication factor of 1 |
| Snapshot operation | AllowSnapshotOps | count/s | Number of AllowSnapshot operations executed per second |
| | DisallowSnapshotOps | count/s | Number of DisallowSnapshot operations executed per second |
| | CreateSnapshotOps | count/s | Number of CreateSnapshot operations executed per second |
| | DeleteSnapshotOps | count/s | Number of DeleteSnapshot operations executed per second |
| | ListSnapshottableDirOps | count/s | Number of ListSnapshottableDir operations executed per second |
| | SnapshotDiffReportOps | count/s | Number of SnapshotDiffReport operations executed per second |
| | RenameSnapshotOps | count/s | Number of RenameSnapshot operations executed per second |
| File operation | CreateFileOps | count/s | Number of CreateFile operations executed per second |
| | GetListingOps | count/s | Number of GetListing operations executed per second |
| | TotalFileOps | count/s | Total number of file operations executed per second |
| | DeleteFileOps | count/s | Number of DeleteFile operations executed per second |
| | FileInfoOps | count/s | Number of FileInfo operations executed per second |
| | GetAdditionalDatanodeOps | count/s | Number of GetAdditionalDatanode operations executed per second |
| | CreateSymlinkOps | count/s | Number of CreateSymlink operations executed per second |
| | GetLinkTargetOps | count/s | Number of GetLinkTarget operations executed per second |
| | FilesInGetListingOps | count/s | Number of files and directories returned by GetListing operations per second |
| File statistics | FilesDeleted | count | Number of files and folders deleted by delete or rename operations |
| | FilesCreated | count | Number of created files and folders |
| | FilesAppended | count | Number of appended files |
| Transaction operation | TransactionsNumOps | count/s | Number of journal transaction operations processed per second |
| | TransactionsBatchedInSync | count/s | Number of journal transaction operations batch processed per second |
| Image operation | GetEditNumOps | count/s | Number of GetEdit operations executed per second |
| | GetImageNumOps | count/s | Number of GetImage operations executed per second |
| | PutImageNumOps | count/s | Number of PutImage operations executed per second |
| Sync operation | SyncsNumOps | count/s | Number of journal sync operations processed per second |
| Block operation | BlockReceivedAndDeletedOps | count/s | Number of BlockReceivedAndDeleted operations executed per second |
| | BlockOpsQueued | count/s | Number of processed DataNode block reporting operations |
| Cache reporting | CacheReportNumOps | count/s | Number of CacheReport operations processed per second |
| Block reporting | BlockReportNumQps | count/s | Number of DataNode block reporting operations processed per second |
| Sync operation latency | SyncsAvgTime | ms | Average latency of processing journal sync operations |
| Cache reporting latency | CacheReportAvgTime | ms | Average latency of cache reporting |
| Image operation latency | GetEditAvgTime | ms | Average latency of reading Edit files |
| | GetImageAvgTime | ms | Average latency of reading image files |
| | PutImageAvgTime | ms | Average latency of writing image files |
| Transaction operation latency | TransactionsAvgTime | ms | Average latency of processing journal transaction operations |
| Start time | StartTime | ms | Process start time |
| Active/Standby status | State | - | NameNode active/standby (HA) status. 1: Active. 0: Standby |
| Threads | PeakThreadCount | - | Peak number of threads |
| | ThreadCount | - | Number of threads |
| | DaemonThreadCount | - | Number of daemon threads |
| Transactions since the last checkpoint | SinceLastCheckpoint | count | Total number of transactions since the last checkpoint |
| Checkpoint time | LastCheckpoint | time | Time since the last checkpoint |
| Length of the queue waiting for file locks | LockQueueLength | count | Length of the queue waiting for file locks |
| Average RPC time (1) | CompleteAvgTime | ms | Average latency of Complete requests |
| | CreateAvgTime | ms | Average latency of Create requests |
| | RenameAvgTime | ms | Average latency of Rename requests |
| | AddBlockAvgTime | ms | Average latency of AddBlock requests |
| | GetListingAvgTime | ms | Average latency of GetListing requests |
| | GetFileInfoAvgTime | ms | Average latency of GetFileInfo requests |
| | SendHeartbeatAvgTime | ms | Average latency of SendHeartbeat requests |
| Average RPC time (2) | RegisterDatanodeAvgTime | ms | Average latency of RegisterDatanode requests |
| | BlockReportAvgTime | ms | Average latency of BlockReport requests |
| | DeleteAvgTime | ms | Average latency of Delete requests |
| | RenewLeaseAvgTime | ms | Average latency of RenewLease requests |
| | BlockReceivedAndDeletedAvgTime | ms | Average latency of BlockReceivedAndDeleted requests |
| | FsyncAvgTime | ms | Average latency of fsync requests |
| | VersionRequestAvgTime | ms | Average latency of VersionRequest requests |
| Average RPC time (3) | ListEncryptionZonesAvgTime | ms | Average latency of ListEncryptionZones requests |
| | SetPermissionAvgTime | ms | Average latency of SetPermission requests |
| | SetTimesAvgTime | ms | Average latency of SetTimes requests |
| | SetSafeModeAvgTime | ms | Average latency of SetSafeMode requests |
| | MkdirsAvgTime | ms | Average latency of Mkdirs requests |
| | GetServerDefaultsAvgTime | ms | Average latency of GetServerDefaults requests |
| | GetBlockLocationsAvgTime | ms | Average latency of GetBlockLocations requests |
| RPC statistics (1) | CompleteNumOps | count/s | Number of Complete calls per second |
| | CreateNumOps | count/s | Number of Create calls per second |
| | RenameNumOps | count/s | Number of Rename calls per second |
| | AddBlockNumOps | count/s | Number of AddBlock calls per second |
| | GetListingNumOps | count/s | Number of GetListing calls per second |
| | GetFileInfoNumOps | count/s | Number of GetFileInfo calls per second |
| | SendHeartbeatNumOps | count/s | Number of SendHeartbeat calls per second |
| RPC statistics (2) | RegisterDatanodeNumOps | count/s | Number of RegisterDatanode calls per second |
| | BlockReportNumOps | count/s | Number of BlockReport calls per second |
| | DeleteNumOps | count/s | Number of Delete calls per second |
| | RenewLeaseNumOps | count/s | Number of RenewLease calls per second |
| | BlockReceivedAndDeletedNumOps | count/s | Number of BlockReceivedAndDeleted calls per second |
| | FsyncNumOps | count/s | Number of fsync calls per second |
| | VersionRequestNumOps | count/s | Number of VersionRequest calls per second |
| RPC statistics (3) | ListEncryptionZonesNumOps | count/s | Number of ListEncryptionZones calls per second |
| | SetPermissionNumOps | count/s | Number of SetPermission calls per second |
| | SetTimesNumOps | count/s | Number of SetTimes calls per second |
| | SetSafeModeNumOps | count/s | Number of SetSafeMode calls per second |
| | MkdirsNumOps | count/s | Number of Mkdirs calls per second |
| | GetServerDefaultsNumOps | count/s | Number of GetServerDefaults calls per second |
| | GetBlockLocationsNumOps | count/s | Number of GetBlockLocations calls per second |
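
To show how a few NameNode metrics might be interpreted once collected, the following minimal sketch computes heap usage from `MemHeapUsedM` and `MemHeapMaxM` and maps the `State` value to a role using the 1: Active, 0: Standby convention listed above. The helper names and sample values are hypothetical, and how the metrics are fetched is deliberately left out.

```python
def heap_usage_ratio(mem_heap_used_m, mem_heap_max_m):
    """Fraction of the maximum heap in use, or None if the maximum is not usable."""
    if mem_heap_max_m <= 0:
        return None
    return mem_heap_used_m / mem_heap_max_m


def ha_role(state):
    """Map the State metric to a role name (1: Active, 0: Standby)."""
    return {1: "Active", 0: "Standby"}.get(state, "Unknown")


# Hypothetical sample values for illustration only.
sample = {"MemHeapUsedM": 1536.0, "MemHeapMaxM": 4096.0, "State": 1}
ratio = heap_usage_ratio(sample["MemHeapUsedM"], sample["MemHeapMaxM"])
print(f"heap={ratio:.1%} role={ha_role(sample['State'])}")
```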

HDFS - DataNode

| Title | Metric | Unit | Description |
|---|---|---|---|
| Xceivers | XceiverCount | - | Number of Xceivers |
| Data read/write rate | BytesWrittenMB | Bytes/s | DataNode byte write rate |
| | BytesReadMB | Bytes/s | DataNode byte read rate |
| | RemoteBytesReadMB | Bytes/s | Remote client byte read rate |
| | RemoteBytesWrittenMB | Bytes/s | Remote client byte write rate |
| Client connections | WritesFromRemoteClient | - | Remote client write QPS |
| | WritesFromLocalClient | - | Local client write QPS |
| | ReadsFromRemoteClient | - | Remote client read QPS |
| | ReadsFromLocalClient | - | Local client read QPS |
| Block verification failure | BlockVerificationFailures | count/s | Number of block verification failures |
| Disk failure | VolumeFailures | count/s | Number of disk failures |
| Network error | DatanodeNetworkErrors | count/s | Number of network errors |
| Heartbeat latency | HeartbeatsAvgTime | ms | Average heartbeat time |
| Heartbeat QPS | HeartbeatsNumOps | count/s | Heartbeat QPS |
| Packet transfer RT | SendDataPacketTransferNanosAvgTime | ms | Average time of sending packets |
| Block operation | ReadBlockOpNumOps | count/s | Block read OPS from DataNode |
| | WriteBlockOpNumOps | count/s | Block write OPS to DataNode |
| | BlockChecksumOpNumOps | count/s | Checksum OPS by DataNode |
| | CopyBlockOpNumOps | count/s | Block copying OPS |
| | ReplaceBlockOpNumOps | count/s | Block replacement OPS |
| | BlockReportsNumOps | count/s | Block reporting OPS |
| | IncrementalBlockReportsNumOps | count/s | Incremental block reporting OPS |
| | CacheReportsNumOps | count/s | Cache reporting OPS |
| | PacketAckRoundTripTimeNanosNumOps | count/s | Number of ACK round trips processed per second |
| Fsync operation | FsyncNanosNumOps | count/s | Number of fsync operations processed per second |
| Flush operation | FlushNanosNumOps | count/s | Number of flush operations processed per second |
| Block operation latency statistics | ReadBlockOpAvgTime | ms | Average block read time |
| | WriteBlockOpAvgTime | ms | Average block write time |
| | BlockChecksumOpAvgTime | ms | Average block checksum time |
| | CopyBlockOpAvgTime | ms | Average block copy time |
| | ReplaceBlockOpAvgTime | ms | Average block replacement time |
| | BlockReportsAvgTime | ms | Average block reporting time |
| | IncrementalBlockReportsAvgTime | ms | Average time of incremental block reporting |
| | CacheReportsAvgTime | ms | Average time of cache reporting |
| | PacketAckRoundTripTimeNanosAvgTime | ms | Average time of ACK round trip processing |
| Flush latency | FlushNanosAvgTime | ms | Average flush time |
| Fsync latency | FsyncNanosAvgTime | ms | Average fsync time |
| RamDisk Blocks | RamDiskBlocksWrite | blocks/s | Total number of blocks written to memory |
| | RamDiskBlocksWriteFallback | blocks/s | Total number of blocks that failed to be written to memory and fell back to disk |
| | RamDiskBlocksDeletedBeforeLazyPersisted | blocks/s | Total number of blocks deleted before being lazily persisted to disk |
| | RamDiskBlocksReadHits | blocks/s | Number of blocks read from memory |
| | RamDiskBlocksEvicted | blocks/s | Total number of blocks evicted from memory |
| | RamDiskBlocksEvictedWithoutRead | blocks/s | Total number of blocks evicted from memory without ever being read |
| | RamDiskBlocksLazyPersisted | blocks/s | Total number of blocks written to disk by lazy writer |
| | RamDiskBytesLazyPersisted | Bytes/s | Total number of bytes written to disk by lazy writer |
| RamDisk write speed | RamDiskBytesWrite | Bytes/s | Total number of bytes written to memory |
| JVM memory | MemNonHeapUsedM | MB | Size of non-heap memory currently used by JVM |
| | MemNonHeapCommittedM | MB | Size of non-heap memory committed by JVM |
| | MemHeapUsedM | MB | Size of heap memory currently used by JVM |
| | MemHeapCommittedM | MB | Size of heap memory committed by JVM |
| | MemHeapMaxM | MB | Maximum size of heap memory configured for JVM |
| | MemMaxM | MB | Maximum size of memory available to JVM runtime |
| JVM threads | ThreadsNew | - | Number of threads in NEW status |
| | ThreadsRunnable | - | Number of threads in RUNNABLE status |
| | ThreadsBlocked | - | Number of threads in BLOCKED status |
| | ThreadsWaiting | - | Number of threads in WAITING status |
| | ThreadsTimedWaiting | - | Number of threads in TIMED_WAITING status |
| | ThreadsTerminated | - | Number of threads in TERMINATED status |
| JVM logs | LogFatal | - | Number of FATAL-level logs |
| | LogError | - | Number of ERROR-level logs |
| | LogWarn | - | Number of WARN-level logs |
| | LogInfo | - | Number of INFO-level logs |
| GC count | YGC | - | Young GC count |
| | FGC | - | Full GC count |
| GC time | FGCT | s | Full GC time |
| | GCT | s | Garbage collection time |
| | YGCT | s | Young GC time |
| Memory zone proportion | S0 | % | Percentage of used Survivor 0 memory |
| | E | % | Percentage of used Eden memory |
| | CCS | % | Percentage of used compressed class space memory |
| | S1 | % | Percentage of used Survivor 1 memory |
| | O | % | Percentage of used Old memory |
| | M | % | Percentage of used Metaspace memory |
| Data traffic | ReceivedBytes | Bytes/s | Data receiving rate |
| | SentBytes | Bytes/s | Data sending rate |
| QPS | RpcQueueTimeNumOps | count/s | RPC call rate |
| Request processing latency | RpcQueueTimeAvgTime | ms | Average RPC queue latency |
| | RpcProcessingTimeAvgTime | ms | Average RPC request processing time |
| Authentication and authorization | RpcAuthenticationFailures | count/s | Number of RPC authentication failures |
| | RpcAuthenticationSuccesses | count/s | Number of RPC authentication successes |
| | RpcAuthorizationFailures | count/s | Number of RPC authorization failures |
| | RpcAuthorizationSuccesses | count/s | Number of RPC authorization successes |
| Current connections | NumOpenConnections | - | Number of current connections |
| Length of RPC processing queue | CallQueueLength | - | Length of current RPC processing queue |
| CPU time | CurrentThreadSystemTime | ms | System time |
| | CurrentThreadUserTime | ms | User time |
| Start time | StartTime | s | Process start time |
| Threads | PeckThreadCount | - | Peak number of threads |
| | DaemonThreadCount | - | Number of daemon threads |
| Read/Write latency | write | ms | Write time |
| | read | ms | Read time |
| Packet transfer QPS | DataPacketOps | count/s | Packet transfer QPS |
| Blocks | Related to disk information, such as `/data/qcloud/data/hdfs` | - | Blocks |
| Used disk capacity | Related to disk information, such as `/data/qcloud/data/hdfs` | GB | Used disk capacity |
| Free disk capacity | Related to disk information, such as `/data/qcloud/data/hdfs` | GB | Free disk capacity |
| Reserved disk capacity | Related to disk information, such as `/data/qcloud/data/hdfs` | GB | Reserved disk capacity |
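
One derived figure that the DataNode client metrics make possible is the share of reads served to local clients, a rough proxy for data locality. The sketch below is a hypothetical helper, not something this page defines; it assumes `ReadsFromLocalClient` and `ReadsFromRemoteClient` are sampled over the same interval.

```python
def local_read_ratio(reads_from_local, reads_from_remote):
    """Share of reads issued by local clients, or None when there were no reads."""
    total = reads_from_local + reads_from_remote
    if total == 0:
        return None
    return reads_from_local / total


# Hypothetical sample: 30 local reads and 10 remote reads in one interval.
ratio = local_read_ratio(30, 10)
print(f"local read share: {ratio:.0%}")
```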

HDFS - JournalNode

| Title | Metric | Unit | Description |
|---|---|---|---|
| JVM memory | MemNonHeapUsedM | MB | Size of non-heap memory currently used by JVM |
| | MemNonHeapCommittedM | MB | Size of non-heap memory committed by JVM |
| | MemHeapUsedM | MB | Size of heap memory currently used by JVM |
| | MemHeapCommittedM | MB | Size of heap memory committed by JVM |
| | MemHeapMaxM | MB | Maximum size of heap memory configured for JVM |
| | MemMaxM | MB | Maximum size of memory available to JVM runtime |
| JVM threads | ThreadsNew | - | Number of threads in NEW status |
| | ThreadsRunnable | - | Number of threads in RUNNABLE status |
| | ThreadsBlocked | - | Number of threads in BLOCKED status |
| | ThreadsWaiting | - | Number of threads in WAITING status |
| | ThreadsTimedWaiting | - | Number of threads in TIMED_WAITING status |
| | ThreadsTerminated | - | Number of threads in TERMINATED status |
| JVM logs | LogFatal | - | Number of FATAL-level logs |
| | LogError | - | Number of ERROR-level logs |
| | LogWarn | - | Number of WARN-level logs |
| | LogInfo | - | Number of INFO-level logs |
| GC count | YGC | - | Young GC count |
| | FGC | - | Full GC count |
| GC time | FGCT | s | Full GC time |
| | GCT | s | Garbage collection time |
| | YGCT | s | Young GC time |
| Memory zone proportion | S0 | % | Percentage of used Survivor 0 memory |
| | E | % | Percentage of used Eden memory |
| | CCS | % | Percentage of used compressed class space memory |
| | S1 | % | Percentage of used Survivor 1 memory |
| | O | % | Percentage of used Old memory |
| | M | % | Percentage of used Metaspace memory |
| Data traffic | ReceivedBytes | Bytes/s | Data receiving rate |
| | SentBytes | Bytes/s | Data sending rate |
| Request processing latency | RpcQueueTimeAvgTime | ms | Average RPC queue latency |
| Authentication and authorization | RpcAuthenticationFailures | count/s | Number of RPC authentication failures |
| | RpcAuthenticationSuccesses | count/s | Number of RPC authentication successes |
| | RpcAuthorizationFailures | count/s | Number of RPC authorization failures |
| | RpcAuthorizationSuccesses | count/s | Number of RPC authorization successes |
| Current connections | NumOpenConnections | - | Number of current connections |
| Length of RPC processing queue | CallQueueLength | - | Length of current RPC processing queue |
| CPU time | CurrentThreadSystemTime | ms | System time |
| | CurrentThreadUserTime | ms | User time |
| Start time | StartTime | s | Process start time |
| Threads | PeckThreadCount | - | Peak number of threads |
| | DaemonThreadCount | - | Number of daemon threads |
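
The GC count and GC time metrics are easier to reason about when combined. As a minimal sketch, assuming `YGCT` and `FGCT` are cumulative seconds and `YGC` and `FGC` are cumulative counts since process start (an assumption this page does not confirm), the average time per collection can be estimated as follows. The helper name and sample values are hypothetical.

```python
def average_gc_seconds(count, total_seconds):
    """Average seconds spent per collection; 0.0 when no collection has run."""
    if count == 0:
        return 0.0
    return total_seconds / count


# Hypothetical sample values for illustration only.
sample = {"YGC": 200, "YGCT": 4.0, "FGC": 2, "FGCT": 1.0}
young_avg = average_gc_seconds(sample["YGC"], sample["YGCT"])
full_avg = average_gc_seconds(sample["FGC"], sample["FGCT"])
print(f"young GC avg: {young_avg * 1000:.0f} ms, full GC avg: {full_avg * 1000:.0f} ms")
```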

HDFS - ZKFC

| Title | Metric | Unit | Description |
|---|---|---|---|
| GC count | YGC | - | Young GC count |
| | FGC | - | Full GC count |
| GC time | FGCT | s | Full GC time |
| | GCT | s | Garbage collection time |
| | YGCT | s | Young GC time |
| Memory zone proportion | S0 | % | Percentage of used Survivor 0 memory |
| | E | % | Percentage of used Eden memory |
| | CCS | % | Percentage of used compressed class space memory |
| | S1 | % | Percentage of used Survivor 1 memory |
| | O | % | Percentage of used Old memory |
| | M | % | Percentage of used Metaspace memory |
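
The GC and memory zone metric names above (S0, S1, E, O, M, CCS, YGC, YGCT, FGC, FGCT, GCT) match the column names printed by the JDK tool `jstat -gcutil`. Whether EMR collects them with that tool is an assumption, not something this page states. For readers who want to compare values by hand, the sketch below parses a header line and a value line in that layout into a dictionary. The sample text and the function name `parse_gcutil` are made up for illustration.

```python
def parse_gcutil(text):
    """Parse a header line plus one value line into {column: float or None}."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, values = rows[0], rows[1]
    parsed = {}
    for name, raw in zip(header, values):
        try:
            parsed[name] = float(raw)
        except ValueError:
            # Some columns can be printed as "-" when they do not apply.
            parsed[name] = None
    return parsed


# Made-up sample in the jstat -gcutil column layout.
SAMPLE = """
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00  45.12  63.40  21.75  96.01  92.30    118    1.254     2    0.310    1.564
"""
gc = parse_gcutil(SAMPLE)
print(gc["E"], gc["YGC"], gc["GCT"])
```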

HDFS - Router

| Title | Metric | Unit | Description |
|---|---|---|---|
| ALTER TABLE request duration | HIVE.HMS.API_ALTER_TABLE | ms | Average duration of ALTER TABLE requests |
| ALTER TABLE WITH ENV CONTEXT request duration | HIVE.HMS.API_ALTER_TABLE_WITH_ENV_CONTEXT | ms | Average duration of ALTER TABLE WITH ENV CONTEXT requests |
| CREATE TABLE request duration | HIVE.HMS.API_CREATE_TABLE | ms | Average duration of CREATE TABLE requests |
| CREATE TABLE WITH ENV CONTEXT request duration | HIVE.HMS.API_CREATE_TABLE_WITH_ENV_CONTEXT | ms | Average duration of CREATE TABLE WITH ENV CONTEXT requests |
| DROP TABLE request duration | HIVE.HMS.API_DROP_TABLE | ms | Average duration of DROP TABLE requests |
| DROP TABLE WITH ENV CONTEXT request duration | HIVE.HMS.API_DROP_TABLE_WITH_ENV_CONTEXT | ms | Average duration of DROP TABLE WITH ENV CONTEXT requests |
| GET TABLE request duration | HIVE.HMS.API_GET_TABLE | ms | Average duration of GET TABLE requests |
| GET TABLES request duration | HIVE.HMS.API_GET_TABLES | ms | Average duration of GET TABLES requests |
| GET MULTI TABLE request duration | HIVE.HMS.API_GET_MULTI_TABLE | ms | Average duration of GET MULTI TABLE requests |
| GET TABLE REQ request duration | HIVE.HMS.API_GET_TABLE_REQ | ms | Average duration of GET TABLE REQ requests |
| GET DATABASE request duration | HIVE.HMS.API_GET_DATABASE | ms | Average duration of GET DATABASE requests |
| GET DATABASES request duration | HIVE.HMS.API_GET_DATABASES | ms | Average duration of GET DATABASES requests |
| GET ALL DATABASES request duration | HIVE.HMS.API_GET_ALL_DATABASES | ms | Average duration of GET ALL DATABASES requests |
| GET ALL FUNCTIONS request duration | HIVE.HMS.API_GET_ALL_FUNCTIONS | ms | Average duration of GET ALL FUNCTIONS requests |
| Current number of active CREATE TABLE requests | HIVE.HMS.ACTIVE_CALLS_API_CREATE_TABLE | - | Current number of active CREATE TABLE requests |
| Current number of active DROP TABLE requests | HIVE.HMS.ACTIVE_CALLS_API_DROP_TABLE | - | Current number of active DROP TABLE requests |
| Current number of active ALTER TABLE requests | HIVE.HMS.ACTIVE_CALLS_API_ALTER_TABLE | - | Current number of active ALTER TABLE requests |
