This document describes the common causes and troubleshooting procedures of CVM network packet loss.
Common Causes
The common causes of CVM network packet loss are as follows:
TCP packet loss due to the limit setting
UDP packet loss due to the limit setting
Packet loss due to soft interrupt
Full UDP send buffer
Full UDP receive buffer
Full TCP accept queue
TCP request overflow
Connections exceeding the upper limit
iptables policy rules
Prerequisites
You have logged in to the CVM instance on which packet loss occurs.
Troubleshooting
TCP packet loss due to the limit setting
Tencent Cloud provides various types of CVM instances, each of which has different network performance. When the maximum bandwidth or packet volume of an instance is reached, packet loss may occur. The troubleshooting procedure is as follows:
1. Check the bandwidth and packet volume of the instance.
For a Linux instance, run the sar -n DEV 2 command to check its bandwidth and packet volume. The rxpck/s and txpck/s metrics indicate the packets received and sent per second, respectively, and the rxkB/s and txkB/s metrics indicate the inbound and outbound bandwidth, respectively. A sample invocation is shown after this list.
2. Compare the result with the performance specification of the instance type and check whether the upper limit is reached. If yes, upgrade the instance or adjust your business volume.
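For reference, a minimal check might look like the following; the interface name eth0 and all numbers are illustrative, not taken from a real instance:
sar -n DEV 2 10
# IFACE     rxpck/s    txpck/s     rxkB/s     txkB/s   (illustrative output)
# eth0     250000.0   240000.0    95000.0    90000.0
# 95000 kB/s x 8 / 1000 = 760 Mbit/s inbound; compare this and the pck/s values
# with the bandwidth and packet limits published for your instance type.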
UDP packet loss due to the limit setting
Check whether the instance has reached its bandwidth or packet volume limit as described in TCP packet loss due to the limit setting above.
If yes, upgrade the instance or adjust your business volume.
If no, the cause may be the frequency limit on DNS requests: after the overall bandwidth or packet volume hits the performance bottleneck of the instance, the DNS request rate may be limited, which causes packet loss. In this case, please submit a ticket for assistance.
Packet loss due to soft interrupt
If the second value of the /proc/net/softnet_stat statistics keeps increasing, the packet loss is caused by soft interrupts. The troubleshooting procedure is as follows:
Check whether RPS (Receive Packet Steering) is enabled.
If no, check whether a high soft interrupt load on a single CPU core is delaying data receiving and sending. In this case, you can:
Enable RPS to distribute soft interrupts more evenly across CPU cores.
Check whether the business program causes an uneven distribution of soft interrupts.
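As a rough sketch, both indicators can be inspected with the following commands; eth0 and rx-0 are placeholders for your actual network interface and receive queue:
awk '{print $2}' /proc/net/softnet_stat
# The second column is a per-CPU counter of packets dropped because the kernel
# input queue was full; a steadily growing value points to soft interrupt pressure.
cat /sys/class/net/eth0/queues/rx-0/rps_cpus
# A value of 0 (or all zeros) means RPS is disabled for this queue.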
Full UDP send buffer
If your instance loses packets due to an insufficient UDP send buffer, the troubleshooting procedure is as follows:
1. Run the ss -nump command to check whether the UDP send buffer is full.
2. If the buffer is full, increase the values of the kernel parameters net.core.wmem_max and net.core.wmem_default, and restart the UDP program for the configuration to take effect. For more information about kernel parameters, see Introduction to Linux Kernel Parameters. A sample adjustment is shown after this list.
3. If the packet loss problem persists, run the ss -nump command again. If the send buffer size does not increase as expected, check whether SO_SNDBUF is configured through the setsockopt function in the code. If so, modify the code to increase the value of SO_SNDBUF.
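The kernel parameters in step 2 can be raised as sketched below; the 26214400-byte (25 MB) value is only an illustration, not a recommendation:
sysctl -w net.core.wmem_max=26214400
sysctl -w net.core.wmem_default=26214400
# Restart the UDP program afterwards so newly created sockets pick up the new defaults.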
Full UDP receive buffer
If your instance loses packets due to an insufficient UDP receive buffer, the troubleshooting procedure is as follows:
1. Run the ss -nump command to check whether the UDP receive buffer is full.
2. If the buffer is full, increase the values of the kernel parameters net.core.rmem_max and net.core.rmem_default, and restart the UDP program for the configuration to take effect. For more information about kernel parameters, see Introduction to Linux Kernel Parameters. A sample adjustment is shown after this list.
3. If the packet loss problem persists, run the ss -nump command again. If the receive buffer size does not increase as expected, check whether SO_RCVBUF is configured through the setsockopt function in the code. If so, modify the code to increase the value of SO_RCVBUF.
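To keep the change across reboots, the parameters can also be written to /etc/sysctl.conf, as sketched below; the value is illustrative:
echo "net.core.rmem_max = 26214400" >> /etc/sysctl.conf
echo "net.core.rmem_default = 26214400" >> /etc/sysctl.conf
sysctl -p
# Reloads the configuration; restart the UDP program afterwards.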
Full TCP accept queue
The TCP accept queue length is the smaller of the net.core.somaxconn kernel parameter value and the backlog value passed in when the business process calls the listen() system call. If your instance loses packets due to a full TCP accept queue, the troubleshooting procedure is as follows (a check for queue overflow is sketched after these steps):
1. Increase the value of the net.core.somaxconn kernel parameter.
2. Check whether the business process passes in the backlog parameter, and increase its value accordingly.
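To confirm whether the accept queue is actually overflowing, the following standard Linux commands may help; no Tencent-specific tooling is assumed:
ss -lnt
# For listening sockets, Recv-Q shows the current accept queue length and Send-Q shows its limit.
netstat -s | grep -i "listen"
# Lines such as "times the listen queue of a socket overflowed" indicate accept queue overflow.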
TCP request overflow
When TCP receives data while the socket is locked by the process, the data is placed in the backlog queue. If this processing fails, packet loss occurs due to TCP request overflow. Assuming the business program performs well, troubleshoot the packet loss problem at the system level.
Check whether the business program sets the buffer size through the setsockopt function.
If yes, modify the business program to specify a larger value or remove the setting.
Note:
The setsockopt value is restricted by the kernel parameters net.core.rmem_max and net.core.wmem_max. You can also adjust the values of these two kernel parameters and then restart the business program for the configuration to take effect.
If not, increase the values of the kernel parameters net.ipv4.tcp_mem, net.ipv4.tcp_rmem, and net.ipv4.tcp_wmem to raise the system-level socket buffer limits, for example as sketched below. For kernel parameter modifications, see Introduction to Linux Kernel Parameters.
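The three parameters each take three values; the numbers below are purely illustrative:
sysctl -w net.ipv4.tcp_mem="378528 504704 757056"
# Memory pages for all TCP sockets: low, pressure, and high thresholds.
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
# Per-socket receive buffer in bytes: min, default, max.
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
# Per-socket send buffer in bytes: min, default, max.
# Restart the business program afterwards so the new limits take effect.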
Connections exceeding the upper limit
Tencent Cloud provides various types of CVM instances, each of which has different connection performance. When the number of connections on an instance exceeds the specified threshold, new connections cannot be established, resulting in packet loss. The troubleshooting procedure is as follows:
Note:
The connection count refers to the number of CVM instance sessions (including TCP, UDP, and ICMP sessions) saved on the host. This value is greater than the number of network connections obtained by running the ss or netstat command on the instance.
Compare the network connections on your instance with the number of connections shown in the instance type and check if the upper limit is reached. If yes, upgrade the instance or adjust your business volume.
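A rough way to view the connection count from inside the instance (keeping in mind that the host-side session count can be higher) is:
ss -s
# Prints a summary of the TCP and UDP sockets currently visible to the instance.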
iptables policy rules
Even if no relevant rules are set in the iptables of the CVM, the problem may be caused by the iptables chain policy settings, which can drop all packets arriving at the CVM. The troubleshooting procedure is as follows:
1. Run the following command to view the iptables policy rules.
iptables -L | grep policy
The iptables policy rule defaults to ACCEPT. If the INPUT chain policy is not ACCEPT, all packets sent to the CVM will be dropped. For example, if the following result is returned, all packets arriving at the CVM will be dropped.
Chain INPUT (policy DROP)
Chain FORWARD (policy ACCEPT)
Chain OUTPUT (policy ACCEPT)
2. Run the following command to adjust the value after -P as needed.
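For example, to set the default policy of the INPUT chain back to ACCEPT (adjust the chain and target to your actual needs):
iptables -P INPUT ACCEPT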
After adjustment, run the command in step 1 again to check that the following result is returned:
Chain INPUT (policy ACCEPT)
Chain FORWARD (policy ACCEPT)
Chain OUTPUT (policy ACCEPT)