The SKE subnet mask is different from the Kubernetes cluster subnet mask, causing some IP addresses to be unreachable
Description
When some nodes in a cluster deploy normally while others have network problems, experience suggests that an unreachable subnet may be the cause. One possible reason is that the subnet mask configured on the SKE (Software-Defined Kernel Edge) differs from the one used for the Kubernetes cluster.
The scenario is as follows. The SKE internal communication interface is configured with a subnet mask of 255.255.255.0, but during regular use the mask is mistakenly assumed to be 255.255.240.0, which is commonly used in classic network settings or in VPC (Virtual Private Cloud) configurations for Elastic IP pools. At first nothing goes wrong, because the nodes start within the same segment. Once the IP addresses in use exceed the range defined by 255.255.255.0, nodes deployed beyond that range lose network connectivity. Because the earlier nodes work normally and the 255.255.255.0 mask on the SKE internal communication interface is not obvious, the problem can be difficult to identify.
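To make the difference between the two masks concrete, the following minimal sketch uses Python's standard ipaddress module to compute the network each mask produces. The address 211.188.55.35 is the SKE internal communication interface address used as an example in this article; the variable names are illustrative only.

```python
import ipaddress

# Example SKE internal communication interface address from this article.
ske_ip = "211.188.55.35"

# Network actually configured on the SKE interface.
actual = ipaddress.ip_interface(f"{ske_ip}/255.255.255.0").network
# Network that was mistakenly assumed during regular use.
assumed = ipaddress.ip_interface(f"{ske_ip}/255.255.240.0").network

print(actual, actual.num_addresses)    # 211.188.55.0/24 256
print(assumed, assumed.num_addresses)  # 211.188.48.0/20 4096
```

The assumed network is sixteen times larger than the configured one, so addresses that look valid under 255.255.240.0 can fall outside the range the SKE interface can actually reach.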
Alert Information
Some nodes in the cluster are not being deployed successfully.
Effective Troubleshooting Steps
This problem could be due to the SKE internal communication interface subnet mask and the Kubernetes cluster subnet mask being inconsistent.
To check the SKE subnet mask:
Method 1: Obtain it from the SKE's configmap relative-info.
Method 2: Obtain it from the network interface of the border router connected to the SKE internal communication interface.
Check whether the IP range configured for SKE and the IP range of the Kubernetes cluster fall in the same subnet. If they do not, addresses outside the SKE subnet are unreachable.
For instance, with a subnet mask of 255.255.255.0, the SKE internal communication interface 211.188.55.35 and the Kubernetes address 211.188.55.37 are in the same subnet and can communicate with each other, but 211.188.56.38 and 211.188.55.35 are in different subnets.
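The same comparison can be done programmatically. This is a minimal sketch, assuming the mask is known; same_subnet is a hypothetical helper name, not part of any SKE or Kubernetes tooling.

```python
import ipaddress

def same_subnet(ip_a: str, ip_b: str, netmask: str) -> bool:
    """Return True if both addresses belong to the same network under netmask."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{netmask}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{netmask}").network
    return net_a == net_b

ske = "211.188.55.35"  # SKE internal communication interface (example above)

# Under the mask actually configured on SKE:
print(same_subnet(ske, "211.188.55.37", "255.255.255.0"))  # True
print(same_subnet(ske, "211.188.56.38", "255.255.255.0"))  # False

# Under the mistakenly assumed mask, the second address looks reachable:
print(same_subnet(ske, "211.188.56.38", "255.255.240.0"))  # True
```

The last call shows why the problem is easy to miss: under the assumed mask 255.255.240.0 both addresses appear to be in one subnet.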
Root Cause
The SKE subnet range is smaller than the Kubernetes cluster's IP subnet range. Nodes deployed at first may work fine, but once node addresses fall outside the SKE subnet, deployment fails because of network issues.
Solution
Modify the Kubernetes cluster IP range so that it falls within the same subnet as SKE. Alternatively, contact technical support to change the subnet mask of the SKE border router.
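Before changing the IP range, it may help to list which planned node addresses fall outside the SKE subnet. The sketch below is one way to do that; ips_outside_subnet and the sample address list are hypothetical and only reuse the example addresses from this article.

```python
import ipaddress

def ips_outside_subnet(ske_ip: str, ske_netmask: str, planned_ips: list[str]) -> list[str]:
    """Return the planned addresses that are not inside the SKE interface's subnet."""
    ske_net = ipaddress.ip_interface(f"{ske_ip}/{ske_netmask}").network
    return [ip for ip in planned_ips if ipaddress.ip_address(ip) not in ske_net]

# Hypothetical planned node addresses, taken from the example in this article.
planned = ["211.188.55.37", "211.188.56.38"]

unreachable = ips_outside_subnet("211.188.55.35", "255.255.255.0", planned)
print(unreachable)  # ['211.188.56.38']
```

An empty result means every planned address is inside the SKE subnet; any address returned needs to be moved into range or the mask needs to be changed.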
Impact Scope
Kubernetes cluster deployment process.
Is This a Temporary Solution?
Yes, the issue is caused by improper network planning.
Recommendations and Summary
The probability of encountering this issue is low, but it is still possible due to configuration errors.
When deploying a large cluster with many nodes, consider this issue if some nodes fail to deploy because of network problems.
Troubleshooting Content
http://docs.sangfor.org/pages/viewpage.action?pageId=383451312
Original Link
https://support.sangfor.com.cn/cases/list?product_id=37&type=1&category_id=28945&isOpen=true



