TKE cluster access may fail in some cases. If you have confirmed that the backend Pods are normal, the cause may be that the kube-proxy add-on version is earlier than required, so the iptables or IPVS forwarding rules on the node are not delivered successfully. This document describes common problems caused by earlier kube-proxy versions and how to fix them. If you still have problems, contact us for assistance.
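Before applying any of the fixes below, it helps to confirm which kube-proxy version is running and to check its recent logs on the affected node for the errors described in this document. Below is a minimal sketch; the kube-system namespace, the k8s-app=kube-proxy label, and the DaemonSet name kube-proxy are the usual defaults and are assumptions here, so adjust them to your cluster:
# List kube-proxy Pods and the nodes they run on (label is an assumption)
kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide
# Print the kube-proxy image (version), assuming a DaemonSet named kube-proxy
kubectl -n kube-system get daemonset kube-proxy -o jsonpath='{.spec.template.spec.containers[0].image}'
# Check recent logs of the kube-proxy Pod on the affected node
kubectl -n kube-system logs <kube-proxy-pod-name> --tail=200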
Failed to execute iptables-restore: exit status 2 (iptables-restore v1.8.4 (legacy): Couldn't load target 'KUBE-MARK-DROP':No such file or directory
When iptables-restore is executed in kube-proxy, the dependent KUBE-MARK-DROP chain doesn't exist, leading to the rule sync failure and exit. The KUBE-MARK-DROP chain is maintained by kubelet; on nodes running later OS versions, kube-proxy on an earlier version cannot read this chain.
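You can verify this on the affected node by checking whether the KUBE-MARK-DROP chain is visible to iptables. A minimal sketch; the iptables-legacy and iptables-nft binaries are only present on distributions that ship iptables 1.8.x with both backends:
# Run on the affected node: list the KUBE-MARK-DROP chain in the nat table
iptables -t nat -S KUBE-MARK-DROP
# On systems that ship both iptables backends, compare what each backend sees
iptables-legacy -t nat -S KUBE-MARK-DROP
iptables-nft -t nat -S KUBE-MARK-DROP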
To fix this, upgrade kube-proxy. Below is the sample logic:
TKE Cluster Version | Fix Policy |
---|---|
> 1.18 | No fixes are required, as the problem doesn't exist. |
1.18 | Upgrade kube-proxy to v1.18.4-tke.26 or later. |
1.16 | Upgrade kube-proxy to v1.16.3-tke.28 or later. |
1.14 | Upgrade kube-proxy to v1.14.3-tke.27 or later. |
1.12 | Upgrade kube-proxy to v1.12.4-tke.31 or later. |
1.10 | Upgrade kube-proxy to v1.10.5-tke.20 or later. |
Note: For more information on the latest TKE versions, see TKE Kubernetes Revision Version History.
Failed to execute iptables-restore: exit status 1 (iptables-restore: line xxx failed)
iptables commands (such as iptables-save and iptables-restore) use a file lock for synchronization to avoid concurrent writes by multiple instances. On Linux, the lock file is generally /run/xtables.lock. For a Pod that needs to call iptables commands, mount the host's /run/xtables.lock file into the Pod as follows:
volumeMounts:
- mountPath: /run/xtables.lock
  name: xtables-lock
  readOnly: false
volumes:
- hostPath:
    path: /run/xtables.lock
    type: FileOrCreate
  name: xtables-lock
Failed to execute iptables-restore: exit status 4 (Another app is currently holding the xtables lock. Perhaps you want to use the -w option?)
iptables commands (such as iptables-save and iptables-restore) use a file lock for synchronization to avoid concurrent writes by multiple instances. When iptables-restore is executed, it tries to acquire the file lock and exits if the lock is held by another process. iptables-restore on later versions provides a -w (--wait) option: with -w=5, iptables-restore waits up to five seconds to acquire the lock, and if another process releases the lock during this period, iptables-restore can continue its operation.
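To check whether the iptables-restore shipped with the node OS already supports the -w option, you can run the following on the node. This is a sketch; whether the option is listed depends on the iptables version installed:
# Print the iptables-restore version installed on the node
iptables-restore --version
# Check whether the --wait (-w) option is available
iptables-restore --help 2>&1 | grep -e '--wait'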
To fix this, get a later version of iptables-restore by upgrading the node OS. Below is the sample logic:
Node OS | Target Version |
---|---|
CentOS | 7.2 or later |
Ubuntu | 20.04 or later |
Tencent Linux | 2.4 or later |
Alternatively, get a later version of iptables-restore by upgrading kube-proxy. Below is the sample logic:
TKE Cluster Version | Fix Policy |
---|---|
> 1.12 | No fixes are required, as the problem doesn't exist. |
1.12 | Upgrade kube-proxy to v1.12.4-tke.31 or later. |
< 1.12 | Upgrade the TKE cluster. |
Note: For more information on the latest TKE versions, see TKE Kubernetes Revision Version History.
Failed to ensure that filter chain KUBE-SERVICES exists: error creating chain "KUBE-EXTERNAL-SERVICES": exit status 4: Another app is currently holding the xtables lock. Stopped waiting after 5s.
iptables commands (such as iptables-save and iptables-restore) use a file lock for synchronization to avoid concurrent writes by multiple instances. When iptables-restore is executed, it tries to acquire the file lock. If the lock is held by another process, iptables-restore waits for a certain period of time (subject to the -w value, which is five seconds by default) to acquire the lock; it continues after acquiring the lock or exits when the wait times out. To fix this, reduce the time other add-ons hold the iptables file lock as much as possible. In particular, earlier versions of the NetworkPolicy (kube-router) add-on provided on the add-on management page in the TKE console hold the iptables lock for a long time; you can upgrade it to the latest version, v1.3.2.
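To find out which process is holding the xtables lock on a node, you can check which processes have /run/xtables.lock open. A minimal sketch; it assumes lsof or fuser (psmisc) is installed on the node:
# Run on the affected node: show processes that have the xtables lock file open
lsof /run/xtables.lock
# Or, with psmisc installed:
fuser -v /run/xtables.lock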
Failed to list *core.Endpoints: Stream error http2.StreamError{StreamID:0xea1, Code:0x2, Cause:error(nil)} when reading response body, may be caused by closed connection. Please retry.
Kubernetes on earlier versions has a bug in its use of the Go HTTP/2 package, which can cause the client to keep using a connection to the API server that has already been closed. When this bug occurs in kube-proxy, rule sync fails. For more information, see (1.17) Kubelet won't reconnect to Apiserver after NIC failure (use of closed network connection) #87615 and Enables HTTP/2 health check #95981.
Upgrade kube-proxy. Below is the sample logic:
TKE Cluster Version | Fix Policy |
---|---|
> 1.18 | No fixes are required, as the problem doesn't exist. |
1.18 | Upgrade kube-proxy to v1.18.4-tke.26 or later. |
< 1.18 | Upgrade the TKE cluster. |
Note: For more information on the latest TKE versions, see TKE Kubernetes Revision Version History.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x50 pc=0x1514fb8]
Upgrade kube-proxy. Below is the sample logic:
TKE Cluster Version | Fix Policy |
---|---|
> 1.18 | No fixes are required, as the problem doesn't exist. |
1.18 | Upgrade kube-proxy to v1.18.4-tke.26 or later. |
< 1.18 | No fixes are required, as the problem doesn't exist. |
Note: For more information on the latest TKE versions, see TKE Kubernetes Revision Version History.
Observed a panic: "slice bounds out of range" (runtime error: slice bounds out of range)
There is a bug in the upstream kube-proxy code: when iptables-save is executed, its standard output and standard error are written to the same buffer, and the order of the two is not deterministic, so the buffer can end up with data in an unexpected format and cause a panic during parsing. For more information, see kube-proxy panics when parsing iptables-save output #78443 and Fix panic in kube-proxy when iptables-save prints to stderr #78428.
Upgrade kube-proxy. Below is the sample logic:
TKE Cluster Version | Fix Policy |
---|---|
> 1.14 | No fixes are required, as the problem doesn't exist. |
1.14 | Upgrade kube-proxy to v1.14.3-tke.27 or later. |
1.12 | Upgrade kube-proxy to v1.12.4-tke.31 or later. |
< 1.12 | No fixes are required, as the problem doesn't exist. |
Note: For more information on the latest TKE versions, see TKE Kubernetes Revision Version History.
kube-proxy may also consume a large amount of CPU because it frequently refreshes the node's Service forwarding rules. If the problem is caused by frequent periodic rule syncs by kube-proxy, you need to modify the relevant parameters. Below are the default parameters of kube-proxy on an earlier version:
--ipvs-min-sync-period=1s (minimum refresh interval of one second)
--ipvs-sync-period=5s (periodic refresh every five seconds)
Therefore, kube-proxy refreshes the node iptables rules once every five seconds, consuming many CPU resources. You can change the configuration to:
--ipvs-min-sync-period=0s (real-time refresh upon event occurrence)
--ipvs-sync-period=30s (periodic refresh every 30 seconds)
The values above are recommended settings and can be adjusted as needed.
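How these flags are applied depends on how kube-proxy is deployed in your cluster. Below is a minimal sketch, assuming kube-proxy runs as a DaemonSet named kube-proxy in the kube-system namespace and reads these values from its container arguments:
# Edit the kube-proxy DaemonSet and adjust the two flags in the container args, e.g.:
#   --ipvs-min-sync-period=0s
#   --ipvs-sync-period=30s
kubectl -n kube-system edit daemonset kube-proxy
# Wait for the updated kube-proxy Pods to roll out
kubectl -n kube-system rollout status daemonset kube-proxy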