IPv6 support in Kubernetes
Dual-stack support in Kubernetes is managed through a dedicated feature gate, IPv6DualStack. From 1.15 through 1.20 the feature was alpha and disabled by default. Starting with 1.21 it became beta and is enabled by default, providing dual-stack Pod and Service networking. In 1.23 it graduated to stable.
Configuring dual-stack
In the alpha versions the feature has to be turned on explicitly by passing --feature-gates="IPv6DualStack=true" to kube-apiserver, kube-controller-manager, kubelet and kube-proxy; in the beta versions it is enabled by default. The relevant dual-stack flags are:
kube-apiserver:
--service-cluster-ip-range=<IPv4 CIDR>,<IPv6 CIDR>
kube-controller-manager:
--cluster-cidr=<IPv4 CIDR>,<IPv6 CIDR>
--service-cluster-ip-range=<IPv4 CIDR>,<IPv6 CIDR>
--node-cidr-mask-size-ipv4|--node-cidr-mask-size-ipv6
Defaults to /24 for IPv4 and /64 for IPv6.
kube-proxy:
--cluster-cidr=<IPv4 CIDR>,<IPv6 CIDR>
Creating a Kubernetes cluster with kind
Since a real IPv6 environment is not always easy to come by, kind can be used to quickly bring up a single-node Kubernetes cluster locally, and kind supports IPv6 configuration.
First check whether the current system has IPv6 enabled; an output of 0 means IPv6 is enabled:
```bash
$ sysctl -a | grep net.ipv6.conf.all.disable_ipv6
net.ipv6.conf.all.disable_ipv6 = 0
```
To create an IPv6 single-stack cluster, run:
```bash
cat > kind-ipv6.conf <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: ipv6-only
networking:
  ipFamily: ipv6
EOF
kind create cluster --config kind-ipv6.conf
```
To create an IPv4/IPv6 dual-stack cluster, run:
```bash
cat > kind-dual-stack.conf <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: dual-stack
networking:
  ipFamily: dual
EOF
kind create cluster --config kind-dual-stack.conf
```
Service networking
Spec definition
To support dual-stack, several new fields were added to the Service spec.
.spec.ipFamilies sets the address families. Once the Service has been created this field cannot be changed by hand, but it can be changed indirectly by modifying .spec.ipFamilyPolicy. It is an array and supports the following values:
- ["IPv4"]
- ["IPv6"]
- ["IPv4","IPv6"] (dual-stack)
- ["IPv6","IPv4"] (dual-stack)
The first element of this array determines the value of .spec.clusterIP.
.spec.clusterIPs: because .spec.clusterIP is a single string, it cannot hold an IPv4 and an IPv6 address at the same time, so the .spec.clusterIPs field was added; its value is an array of strings. For a headless Service its value is None.
.spec.clusterIP: the value of this field stays consistent with the first element of .spec.clusterIPs.
.spec.ipFamilyPolicy supports the following values:
- SingleStack: the default. The cluster IP is allocated from the first address family listed in .spec.ipFamilies; if .spec.ipFamilies is not set, the address comes from the first CIDR configured in service-cluster-ip-range.
- PreferDualStack: allocates cluster IPs for both address families when the cluster has dual-stack enabled; if .spec.ipFamilies is not set, the cluster's default ipFamily comes first.
- RequireDualStack: both an IPv4 and an IPv6 address are allocated; .spec.clusterIP takes the first element of .spec.clusterIPs (see the sketch below).
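For illustration, a minimal manifest that explicitly requests dual-stack could look like the following; the name and selector are placeholders, and creating it on a cluster without dual-stack enabled would fail:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-dual-stack-service
spec:
  ipFamilyPolicy: RequireDualStack
  ipFamilies:        # IPv6 first, so clusterIP will be the IPv6 address
  - IPv6
  - IPv4
  selector:
    app: MyApp
  ports:
  - protocol: TCP
    port: 80
```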
Service configuration scenarios
First create the following Deployment so that the Services in the experiments below have Pods to select:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: MyApp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: MyApp
  template:
    metadata:
      labels:
        app: MyApp
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
```
No address family specified
For a Service created without any address-family information, .spec.ipFamilyPolicy ends up as SingleStack, and the address is allocated from the first CIDR configured in service-cluster-ip-range; if that first CIDR is IPv6, the allocated cluster IP is an IPv6 address. Create the following Service:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
  labels:
    app: MyApp
spec:
  selector:
    app: MyApp
  ports:
  - protocol: TCP
    port: 80
```
The following Service is generated:
```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: MyApp
  name: my-service
  namespace: default
spec:
  clusterIP: 10.96.80.114
  clusterIPs:
  - 10.96.80.114
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: MyApp
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
```
Setting .spec.ipFamilyPolicy to PreferDualStack
Create the following Service:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service-2
  labels:
    app: MyApp
spec:
  ipFamilyPolicy: PreferDualStack
  selector:
    app: MyApp
  ports:
  - protocol: TCP
    port: 80
```
After it is submitted to the cluster, the following Service is generated; note that the addresses in .spec.clusterIPs appear in the same order as the families in ipFamilies.
```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: MyApp
  name: my-service-2
  namespace: default
spec:
  clusterIP: 10.96.221.70
  clusterIPs:
  - 10.96.221.70
  - fd00:10:96::7d1
  ipFamilies:
  - IPv4
  - IPv6
  ipFamilyPolicy: PreferDualStack
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: MyApp
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
```
Looking at the Endpoints, the addresses in subsets are the Pods' IPv4 addresses: the Endpoints addresses follow the first address family in the Service's .spec.ipFamilies.
```yaml
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    app: MyApp
  name: my-service-2
  namespace: default
subsets:
- addresses:
  - ip: 10.244.0.5
    nodeName: dual-stack-control-plane
    targetRef:
      kind: Pod
      name: nginx-deployment-65b5dd4c68-vrfps
      namespace: default
      resourceVersion: "16875"
      uid: 30a1d787-f799-4250-8c56-c96564ca9239
  - ip: 10.244.0.6
    nodeName: dual-stack-control-plane
    targetRef:
      kind: Pod
      name: nginx-deployment-65b5dd4c68-wgz72
      namespace: default
      resourceVersion: "16917"
      uid: 8166d43e-2702-45c6-839e-b3005f44f647
  - ip: 10.244.0.7
    nodeName: dual-stack-control-plane
    targetRef:
      kind: Pod
      name: nginx-deployment-65b5dd4c68-x4lt5
      namespace: default
      resourceVersion: "16896"
      uid: f9c2968f-ca59-4ba9-a69f-358c202a964b
  ports:
  - port: 80
    protocol: TCP
```
Next, specify the order of .spec.ipFamilies explicitly and look at the result. Create the following Service:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service-3
  labels:
    app: MyApp
spec:
  ipFamilyPolicy: PreferDualStack
  ipFamilies:
  - IPv6
  - IPv4
  selector:
    app: MyApp
  ports:
  - protocol: TCP
    port: 80
```
The Service generated in the cluster is shown below; the first entry in .spec.clusterIPs is now an IPv6 address, and .spec.clusterIP is that same IPv6 address.
```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: MyApp
  name: my-service-3
  namespace: default
spec:
  clusterIP: fd00:10:96::c306
  clusterIPs:
  - fd00:10:96::c306
  - 10.96.147.82
  ipFamilies:
  - IPv6
  - IPv4
  ipFamilyPolicy: PreferDualStack
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: MyApp
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
```
Looking at the Endpoints, the addresses in subsets are now the Pods' IPv6 addresses; again, the Endpoints addresses follow the first address family in the Service's .spec.ipFamilies.
```yaml
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    app: MyApp
  name: my-service-3
  namespace: default
subsets:
- addresses:
  - ip: fd00:10:244::5
    nodeName: dual-stack-control-plane
    targetRef:
      kind: Pod
      name: nginx-deployment-65b5dd4c68-vrfps
      namespace: default
      resourceVersion: "16875"
      uid: 30a1d787-f799-4250-8c56-c96564ca9239
  - ip: fd00:10:244::6
    nodeName: dual-stack-control-plane
    targetRef:
      kind: Pod
      name: nginx-deployment-65b5dd4c68-wgz72
      namespace: default
      resourceVersion: "16917"
      uid: 8166d43e-2702-45c6-839e-b3005f44f647
  - ip: fd00:10:244::7
    nodeName: dual-stack-control-plane
    targetRef:
      kind: Pod
      name: nginx-deployment-65b5dd4c68-x4lt5
      namespace: default
      resourceVersion: "16896"
      uid: f9c2968f-ca59-4ba9-a69f-358c202a964b
  ports:
  - port: 80
    protocol: TCP
```
Switching between single-stack and dual-stack
Although .spec.ipFamilies cannot be modified directly, .spec.ipFamilyPolicy can be, and that is enough to switch a Service between single-stack and dual-stack.
To go from single-stack to dual-stack, simply change .spec.ipFamilyPolicy from SingleStack to PreferDualStack or RequireDualStack.
Going from dual-stack back to single-stack works the same way: change .spec.ipFamilyPolicy from PreferDualStack or RequireDualStack to SingleStack. At that point .spec.ipFamilies is automatically reduced to a single element, and so is .spec.clusterIPs.
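A sketch of how such a switch might be done with kubectl patch, reusing the my-service example from above:

```bash
# Switch an existing single-stack Service to dual-stack by changing only ipFamilyPolicy.
kubectl patch svc my-service -p '{"spec":{"ipFamilyPolicy":"PreferDualStack"}}'
kubectl get svc my-service -o jsonpath='{.spec.clusterIPs}{"\n"}'

# Switch back to single-stack; the secondary clusterIP and ipFamily are released.
kubectl patch svc my-service -p '{"spec":{"ipFamilyPolicy":"SingleStack"}}'
```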
LoadBalancer Services
For a LoadBalancer Service in a single-stack cluster, the VIP is written to .status.loadBalancer.ingress. If the ingress addresses are dual-stack, both VIPs should be written to .status.loadBalancer.ingress, and their order should match the order of the Service's .spec.ipFamilies.
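As an illustration only, the status of such a dual-stack LoadBalancer Service might look like the snippet below; the VIPs are made-up documentation addresses and the exact behaviour depends on the cloud provider or load-balancer implementation in use:

```yaml
# Hypothetical status fragment of a LoadBalancer Service with ipFamilies ["IPv4","IPv6"]
status:
  loadBalancer:
    ingress:
    - ip: 203.0.113.10
    - ip: 2001:db8::10
```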
Pod networking
IPv6 support in the Pod network depends on the container network plugin. To surface it in the API, a new .status.podIPs field lists the IPv4 and IPv6 addresses assigned to the Pod; .status.podIP stays consistent with the first element of .status.podIPs.
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    k8s-app: kube-dns
    pod-template-hash: 558bd4d5db
  name: coredns-558bd4d5db-b2zbj
  namespace: kube-system
spec:
  containers:
  - args:
    - -conf
    - /etc/coredns/Corefile
    image: k8s.gcr.io/coredns/coredns:v1.8.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 5
      httpGet:
        path: /health
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 60
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    name: coredns
    ports:
    - containerPort: 53
      name: dns
      protocol: UDP
    - containerPort: 53
      name: dns-tcp
      protocol: TCP
    - containerPort: 9153
      name: metrics
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /ready
        port: 8181
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        memory: 170Mi
      requests:
        cpu: 100m
        memory: 70Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
        - NET_BIND_SERVICE
        drop:
        - all
      readOnlyRootFilesystem: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/coredns
      name: config-volume
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-hgxnc
      readOnly: true
  dnsPolicy: Default
  enableServiceLinks: true
  nodeName: dual-stack-control-plane
  nodeSelector:
    kubernetes.io/os: linux
  preemptionPolicy: PreemptLowerPriority
  priority: 2000000000
  priorityClassName: system-cluster-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: coredns
  serviceAccountName: coredns
  terminationGracePeriodSeconds: 30
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  ...
  volumes:
  - configMap:
      defaultMode: 420
      items:
      - key: Corefile
        path: Corefile
      name: coredns
    name: config-volume
  - name: kube-api-access-hgxnc
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-01-16T12:51:24Z"
    status: "True"
    type: Initialized
  ...
  containerStatuses:
  - containerID: containerd://6da36ab908291ca1b4141a86d70f8c2bb150a933336d852bcabe2118aa1a3439
    image: k8s.gcr.io/coredns/coredns:v1.8.0
    imageID: sha256:296a6d5035e2d6919249e02709a488d680ddca91357602bd65e605eac967b899
    lastState: {}
    name: coredns
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-01-16T12:51:27Z"
  hostIP: 172.18.0.2
  phase: Running
  podIP: 10.244.0.3
  podIPs:
  - ip: 10.244.0.3
  - ip: fd00:10:244::3
  qosClass: Burstable
  startTime: "2022-01-16T12:51:24Z"
```
To get the IPv4 and IPv6 addresses from inside a Pod, .status.podIPs can be passed into the container as an environment variable through the downward API; the value the container then sees has the form 10.244.1.4,a00:100::4.
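A minimal sketch of such a Pod (the pod, container and variable names are made up):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-ips-demo
spec:
  containers:
  - name: app
    image: nginx:1.14.2
    env:
    - name: POD_IPS
      valueFrom:
        fieldRef:
          fieldPath: status.podIPs   # comma-separated, same order as .status.podIPs
```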
Kubernetes node
On the host, ip addr show eth0 shows the addresses assigned to the NIC:
```
23: eth0@if24: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:12:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.18.0.2/16 brd 172.18.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fc00:f853:ccd:e793::2/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe12:2/64 scope link
       valid_lft forever preferred_lft forever
```
The IPv6 address on the host NIC shows up in the node's .status.addresses: there are InternalIP entries for both the IPv4 and the IPv6 address, but nothing in the entry indicates which family the address belongs to.
```yaml
apiVersion: v1
kind: Node
metadata:
  annotations:
    kubeadm.alpha.kubernetes.io/cri-socket: unix:///run/containerd/containerd.sock
    node.alpha.kubernetes.io/ttl: "0"
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: dual-stack-control-plane
    kubernetes.io/os: linux
    node-role.kubernetes.io/control-plane: ""
    node-role.kubernetes.io/master: ""
    node.kubernetes.io/exclude-from-external-load-balancers: ""
  name: dual-stack-control-plane
spec:
  podCIDR: 10.244.0.0/24
  podCIDRs:
  - 10.244.0.0/24
  - fd00:10:244::/64
  providerID: kind://docker/dual-stack/dual-stack-control-plane
status:
  addresses:
  - address: 172.18.0.2
    type: InternalIP
  - address: fc00:f853:ccd:e793::2
    type: InternalIP
  - address: dual-stack-control-plane
    type: Hostname
  allocatable:
    cpu: "4"
    ephemeral-storage: 41152812Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 15936568Ki
    pods: "110"
  capacity:
    cpu: "4"
    ephemeral-storage: 41152812Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 15936568Ki
    pods: "110"
  conditions:
  - lastHeartbeatTime: "2022-01-16T15:01:48Z"
    lastTransitionTime: "2022-01-16T12:50:54Z"
    message: kubelet has sufficient memory available
    reason: KubeletHasSufficientMemory
    status: "False"
    type: MemoryPressure
  ...
  daemonEndpoints:
    kubeletEndpoint:
      Port: 10250
  images:
  - names:
    - k8s.gcr.io/kube-proxy:v1.21.1
    sizeBytes: 132714699
  ...
  nodeInfo:
    architecture: amd64
    bootID: 7f95abb9-7731-4a8c-9258-4a91cdcfb2ca
    containerRuntimeVersion: containerd://1.5.2
    kernelVersion: 4.18.0-305.an8.x86_64
    kubeProxyVersion: v1.21.1
    kubeletVersion: v1.21.1
    machineID: 8f6a98bffc184893ab6bc260e705421b
    operatingSystem: linux
    osImage: Ubuntu 21.04
    systemUUID: f7928fdb-32be-4b6e-8dfd-260b6820f067
```
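If only the node IPs are of interest, a jsonpath query such as the one below can filter the InternalIP entries; distinguishing IPv4 from IPv6 is then left to the caller, since the API does not label the family:

```bash
# List only the InternalIP addresses of the node from the example above.
kubectl get node dual-stack-control-plane \
  -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}'
```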
Ingress
The NGINX ingress controller exposes a disable-ipv6 switch that controls whether IPv6 is used; IPv6 is enabled by default.
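For example, assuming a standard ingress-nginx installation whose controller reads the ingress-nginx/ingress-nginx-controller ConfigMap (names vary between installs), IPv6 could be switched off like this:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  disable-ipv6: "true"   # drop the IPv6 listen directives from the generated nginx.conf
```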
With IPv6 enabled, if the ingress-nginx Pod itself has no IPv6 address, the generated nginx configuration does not listen on IPv6 ports:
```nginx
listen 80 ;
listen 443 ssl http2 ;
```
If the ingress-nginx Pod does have an IPv6 address, the nginx configuration looks like this:
```nginx
listen 80 ;
listen [::]:80 ;
listen 443 ssl http2 ;
listen [::]:443 ssl http2 ;
```