This document describes common network issues that may occur in TKE clusters and provides troubleshooting methods. When you encounter such an issue, work through the checks suggested below. If the network is still inaccessible after you have confirmed that all check items are correct, contact us for help.
Inaccessibility Between Containers (Pods) on Different Nodes in a Cluster
Pods on different nodes in the same cluster can directly access each other. If a pod on a node cannot access a pod on another node, you are advised to perform the following checks:
1. Check whether the above nodes can access each other.
2. Check whether the node security group correctly allows the container network segment and the VPC network segment or VPC subnet segment where the peer node is located.
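The first check can be scripted when several nodes are involved. The following is a minimal sketch, assuming Linux ping (iputils) is available on the node and that the security group permits ICMP. The name check_nodes is a hypothetical helper, not part of TKE, and an "unreachable" result may only mean that ICMP is blocked.

```shell
# Ping each given node IP and report which ones answer.
# Returns non-zero if at least one node did not answer.
check_nodes() {
    local ip rc=0
    for ip in "$@"; do
        # -c 3: send three probes; -W 2: wait up to 2 seconds per reply (Linux ping)
        if ping -c 3 -W 2 "$ip" >/dev/null 2>&1; then
            echo "$ip reachable"
        else
            echo "$ip unreachable"
            rc=1
        fi
    done
    return "$rc"
}

# Example with placeholder node IPs:
# check_nodes 10.0.0.1 10.0.0.9
```

Run it from each node against the other node's IP address, because a one-way security group rule can make the result asymmetric.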
Inaccessibility Between a Node and a Container (Pod) in the Same VPC
A node and a pod in the same VPC can directly access each other. If an inaccessibility issue occurs, you are advised to perform the following checks:
1. Check whether the peer node and the node where the pod is located can access each other.
2. Check whether the security group of the node where the pod is located correctly allows the VPC subnet segment where the peer node is located.
3. Check whether the security group of the peer node correctly allows the container segment.
Inaccessibility Between a Node and a Container (Pod) or Between Containers (Pods) in Different VPCs
Access between different VPCs must go through Cloud Connect Network or Peering Connection. If inaccessibility issues persist after the connection is established, you are advised to perform the following checks:
1. Check whether the nodes can access each other.
2. Check whether the security group of the peer node correctly allows the VPC network segment and container network segment.
3. Check whether the security group of the node where the pod is located correctly allows the VPC network segment or VPC subnet segment of the peer node.
If containers (pods) cannot access each other, perform the following checks:
1. Check whether the security group of the node where the pod is located correctly allows the peer VPC network segment (or the VPC subnet segment where the node is located) and the container network segment.
2. To let the peer see the pod's source IP address instead of the SNAT-translated node IP, run the following command to edit the ip-masq-agent configuration, and add the peer's container network segment on each side.
kubectl -n kube-system edit configmap ip-masq-agent-config
Inaccessibility Between an IDC and a Container (Pod)
The mutual access between an IDC and a pod must be completed through Cloud Connect Network or Direct Connect Gateway. If inaccessibility issues persist after the connection is established, you are advised to perform the following checks:
1. Check whether the IDC firewall allows the container network segment and CVM network segment.
2. Check whether the CVM security group allows the IDC network segment.
3. Check whether the IDC uses the BGP protocol:
If the BGP protocol is not used, you need to configure the next-hop route to the direct connect gateway in the IDC for accessing the container network segment.
If the BGP protocol is used, automatic synchronization will be performed and typically no configuration is required. If the IDC has special static configurations, you can contact the O&M personnel to configure the next-hop to the direct connect gateway for accessing the container network segment.
Note
To see the pod's own IP address (rather than the node IP) from the IDC, you need to exempt the IDC network segment from SNAT.
By default, packets sent to destinations outside the VPC go through SNAT and leave with the node IP as their source address. Adding the IDC network segment to the non-masquerade list bypasses this SNAT processing.
To do so, run the kubectl -n kube-system edit configmap ip-masq-agent-config command, modify the ip-masq-agent configuration, and add the IDC network segment to the NonMasqueradeCIDRs list.
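After editing the ConfigMap, it can help to confirm that the segment was actually saved. The sketch below is an illustration only: cidr_listed is a hypothetical helper, 192.168.0.0/16 is a placeholder IDC segment, and the check looks for the CIDR anywhere in the dump rather than specifically inside the NonMasqueradeCIDRs list.

```shell
# cidr_listed CIDR: read a configuration dump on stdin and succeed only if
# CIDR appears in it as a complete token, so that 10.0.0.0/1 does not
# match an entry such as 10.0.0.0/16.
cidr_listed() {
    # -F: fixed string, -q: quiet, -w: match whole words only
    grep -Fqw -- "$1"
}

# Example (requires cluster access):
# kubectl -n kube-system get configmap ip-masq-agent-config -o yaml \
#     | cidr_listed 192.168.0.0/16 \
#     && echo "segment is listed" || echo "segment is not listed"
```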
Related Operations
Viewing iptables or IPVS Forwarding Rules on a Node
You can run the following commands to view the iptables or IPVS forwarding rules on a node.
Run the following command to view the iptables forwarding rules.
Run the following command to view the IPVS forwarding rules.
ipvsadm -Ln -t/-u ip:port
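The -t/-u notation above means that -t selects a TCP service and -u a UDP service. The sketch below spells this out with a hypothetical helper, ipvs_query_cmd, and a placeholder service address. The iptables command is not shown on this page, so the iptables-save line in the comments is an assumed equivalent rather than the original command.

```shell
# Print the ipvsadm command that lists one virtual service.
# proto is tcp or udp; addr is the service address in ip:port form.
ipvs_query_cmd() {
    local proto="$1" addr="$2"
    case "$proto" in
        tcp) echo "ipvsadm -Ln -t $addr" ;;
        udp) echo "ipvsadm -Ln -u $addr" ;;
        *)   echo "unsupported protocol: $proto" >&2; return 1 ;;
    esac
}

# Example, run as root on the node (172.16.0.10:80 is a placeholder):
# $(ipvs_query_cmd tcp 172.16.0.10:80)
#
# Assumed equivalent when kube-proxy runs in iptables mode:
# iptables-save | grep 172.16.0.10
```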
Running Packet Capture Commands
The following packet capture commands can be used to analyze situations where containers (pods) on different nodes in a cluster cannot access each other.
Run the following command to capture packets of the NIC eth0 on the node of the source pod.
tcpdump -nn -vv -i eth0 host <IP address of the peer pod>
Run the following command to capture packets of the NIC eth0 on the node of the peer pod.
tcpdump -nn -vv -i eth0 host <IP address of the source pod>
Run the following command to capture packets of the NIC eth0 in the netns of the peer pod.
tcpdump -nn -vv -i eth0 port <Requested port number>
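The host filters above need the pod IP address. The following sketch looks it up with kubectl and passes it to the same tcpdump invocation. It assumes kubectl is configured on the node; pod_ip and capture_pod_traffic are hypothetical helper names.

```shell
# Print the IP address of a pod (namespace defaults to "default").
pod_ip() {
    local pod="$1" ns="${2:-default}"
    kubectl -n "$ns" get pod "$pod" -o jsonpath='{.status.podIP}'
}

# Capture traffic to or from the given pod on eth0.
# Run on the node as root; arguments are the same as for pod_ip.
capture_pod_traffic() {
    local ip
    ip=$(pod_ip "$@") || return 1
    [ -n "$ip" ] || { echo "no IP found for pod $1" >&2; return 1; }
    tcpdump -nn -vv -i eth0 host "$ip"
}

# Example with a placeholder pod name:
# capture_pod_traffic proxy-5546768954-9rxg6
```

Run it on the source node with the peer pod name, and on the peer node with the source pod name, to compare what each side sees.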
Locating Network Issues by Capturing Packets in a Container
When running applications by using Kubernetes, you may encounter some network issues, the most common of which are server unresponsiveness (timeout) and abnormal packet return content. If you cannot locate an issue in the related configurations, you need to check whether the data packets are ultimately routed to the container, or whether the content of the packets arriving at and leaving the container aligns with expectations, and further narrow down the issue scope by analyzing the packets. This topic provides a script that allows one-click access to the container network namespace (netns) and uses tcpdump on the host for packet capturing.
Using a Script to Access the Pod netns in One-Click Mode for Packet Capturing
If a service cannot be accessed, you are advised to set the number of replicas to 1 and perform the following steps to capture packets.
1. Run the following command to obtain the node where the replica is located and the pod name.
2. Log in to the node where the pod is located, and paste the following script into the Shell to register the function to the currently logged-in Shell.
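function e() {
    # Usage: e <pod name> [namespace]
    local ns="${2:-default}" cid pid cmd
    # Extract the first docker:// container ID from the pod description.
    cid=$(kubectl -n "$ns" describe pod "$1" | grep -A10 "^Containers:" | grep -Eo 'docker://.*$' | head -n 1 | sed 's#^docker://##')
    [ -n "$cid" ] || { echo "no docker container found for $ns/$1" >&2; return 1; }
    # Resolve the PID of the container's main process.
    pid=$(docker inspect -f '{{.State.Pid}}' "$cid") || return 1
    echo "entering pod netns for $ns/$1"
    cmd="nsenter -n --target $pid"
    echo "$cmd"
    # Start a shell inside the pod's network namespace.
    $cmd
}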
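3. Run the e function registered above, passing the pod name and, optionally, the namespace (default is used when the namespace is omitted), to enter the netns where the pod is located in one-click mode.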
An example is as follows:
e istio-galley-58c7c7c646-m6568 istio-system
e proxy-5546768954-9rxg6
Note
After entering the netns of the pod, you can run the ip a or ifconfig command on the host to view the NIC of the container and run the netstat -tunlp command to view the listening ports of the current container.
4. Run the following command to capture packets using tcpdump.
tcpdump -i eth0 -w test.pcap port 80
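The steps above can be sketched as two small helpers. The command for step 1 is not shown on this page, so pods_with_nodes is an assumed way to list pods with their nodes using standard kubectl output options. capture_in_netns is a hypothetical variant of steps 3 and 4 that runs tcpdump inside the namespace directly instead of opening an interactive shell first.

```shell
# Step 1 (assumed command): list pods in a namespace with the node each runs on.
pods_with_nodes() {
    local ns="${1:-default}"
    kubectl -n "$ns" get pods --no-headers \
        -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName
}

# Steps 3 and 4 combined: run tcpdump inside the netns of the process with
# the given PID and write the capture to a file (default: test.pcap).
capture_in_netns() {
    local pid="$1" port="$2" out="${3:-test.pcap}"
    nsenter -n --target "$pid" tcpdump -i eth0 -w "$out" port "$port"
}

# Example (1234 is a placeholder PID obtained from docker inspect):
# pods_with_nodes istio-system
# capture_in_netns 1234 80
```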
Analyzing Packets by Using Wireshark
You can stop packet capturing by pressing Ctrl+C, and then run the scp or sz command to download the captured packets for analysis in Wireshark. During the analysis, you may use the following common Wireshark filtering syntax:
Establish a Telnet connection and send test text, such as "lbtest". Run the following command to check whether the sent test packet is delivered to the container.
If the container offers the HTTP service, you can use curl to send test path requests.
Run the following command to filter the URI, and check whether the packet is delivered to the container.
http.request.uri=="/mytest"
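The same display filters can be applied on the command line with tshark, the terminal counterpart of Wireshark. This is a sketch under assumptions: tshark is not mentioned in the original steps, filter_pcap is a hypothetical helper, and the filter for the Telnet test text is missing from this page, so tcp contains "lbtest" is an assumed expression that matches the text in the TCP payload.

```shell
# Apply a Wireshark display filter to a capture file with tshark.
filter_pcap() {
    local file="$1" filter="$2"
    # -r: read packets from a file; -Y: display filter expression
    tshark -r "$file" -Y "$filter"
}

# Packets whose TCP payload carries the Telnet test text (assumed filter):
# filter_pcap test.pcap 'tcp contains "lbtest"'
#
# HTTP requests for the test path:
# filter_pcap test.pcap 'http.request.uri == "/mytest"'
```

If neither filter returns any packet, the request most likely never reached the container, which points to routing or security group configuration rather than the application.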
Script Principles
View the container ID for which the specified pod runs.
kubectl describe pod <pod> -n mservice
Obtain the PID of the container process.
docker inspect -f {{.State.Pid}} <container>
Enter the network namespace of the container.
nsenter -n --target <PID>
The host commands that the above script depends on are kubectl, docker, nsenter, grep, head, and sed.
Viewing the Node Security Group Configuration
1. Log in to the TKE console and choose Cluster in the navigation bar.
2. Click the cluster ID to go to the cluster details page.
3. In the navigation bar, choose Node management > Node.
4. On the node list page, click the ID of the node for which you want to view the security group.
5. On the node management page, click the Details tab and click the node ID under Node information.
6. On the basic information page of the node, click the Security group tab, and check whether the security group of the node correctly allows the port range of 30000 to 32768.