Extremely high CPU usage with a simple multi-node cluster (kube-apiserver)

Deployed a bare-metal arm64 multi-node cluster via k0sctl v0.16.0.

apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster
spec:
  hosts:
  - role: controller+worker
    noTaints: false
    ssh:
      address: 192.168.1.110 
      user: root
      keyPath: ~/.ssh/id_rsa
  - role: controller+worker
    noTaints: false
    ssh:
      address: 192.168.1.114  
      user: root
      keyPath: ~/.ssh/id_rsa
  - role: controller+worker
    noTaints: false
    ssh:
      address: 192.168.1.112  
      user: root
      keyPath: ~/.ssh/id_rsa
  - role: worker
    ssh:
      address: 192.168.1.113 
      user: root
      keyPath: ~/.ssh/id_rsa              

  k0s:
    version: v1.28.4+k0s.0
    dynamicConfig: false
    config: {}

The controller nodes are 6-core ARM SBCs.
On all three of them, every CPU core is idling at 50-60%.

kube-apiserver is at 149% CPU usage.
The command listed in htop is /var/lib/k0s/bin/kube-apiserver --service-account-key....

Is this normal?

Looks like you’re missing some sort of load-balancing in your setup.

See:

Would installing Traefik solve the issue, as it is a load balancer as well?

Or do I need Envoy and Traefik?

It seemed that I need Envoy; my working YAML can be seen here.

But that YAML only fixes the high CPU issue.
I have created a separate post, as it's a bigger issue with installing Traefik as a load balancer with k0s.

When talking about load balancing in this context here, I mean load balancing in front of the control plane, i.e. the Kubernetes API server processes and so on. The docs you linked are showing an example of how to use Traefik as an ingress controller. Ingress controllers are a way of exposing workloads running in the cluster to the outside world, which is different.

For control plane load balancing, the only built-in option is node-local load balancing. If you want to configure the load balancing yourself, you can use any load balancer you like, including Traefik, but the aforementioned docs describe a different use case.
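
For reference, here's what that looks like in a standalone k0s config (a minimal sketch; the complete k0sctl example further down embeds the same setting, and EnvoyProxy is the default type):

apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  network:
    nodeLocalLoadBalancing:
      enabled: true
      type: EnvoyProxy # runs a local Envoy proxy on each worker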

Thanks for the reply, Tom.
Ultimately, if I do use the Traefik ingress controller, should I be using the Envoy node-local load balancer?

Without it, I had the high CPU usage.

… you can use any load balancer you like, including Traefik.

I'd really appreciate practical help with creating a functional YAML for this.

I want to leverage Traefik's HTTP/3 capabilities. All I'm trying to do is use k0s in a multi-node setup with Traefik working.

Ultimately, if I do use the Traefik ingress controller, should I be using the Envoy node-local load balancer?

Both are independent things and not connected to each other. For an HA control plane, you need some sort of load balancing for the Kubernetes control plane itself to get a working cluster. This is a hard requirement; without it, your cluster will be dysfunctional. Enabling node-local load balancing in the k0s configuration is the easiest way to do this.

Installing an ingress controller, on the other hand, is an optional choice. There are a number of different implementations available, including the Traefik ingress controller. The same goes for MetalLB. Both are optional and depend on your needs.

I'd really appreciate practical help with creating a functional YAML for this.

I want to leverage Traefik's HTTP/3 capabilities. All I'm trying to do is use k0s in a multi-node setup with Traefik working.

Alright, let’s try this. The example in the docs seems to be outdated, indeed. Based on the configuration you shared, I’ve tried to put together a little walkthrough that will hopefully get you started with node-local load balancing, MetalLB and the Traefik ingress controller. Please also take a look at the MetalLB and Traefik ingress controller docs. They contain a lot of details that are valuable to know. I haven’t gone into much detail explaining all of this.

Try to start with the following k0sctl config and apply it via k0sctl:

apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster
spec:
  hosts:
    - role: controller+worker
      ssh:
        address: 192.168.1.110
        user: root
        keyPath: ~/.ssh/id_rsa
      installFlags:
        - --labels=rohanrehmann.local/node-subnet=c-one
      noTaints: true
    - role: controller+worker
      ssh:
        address: 192.168.1.114
        user: root
        keyPath: ~/.ssh/id_rsa
      installFlags:
        - --labels=rohanrehmann.local/node-subnet=c-one
      noTaints: true
    - role: controller+worker
      ssh:
        address: 192.168.1.112
        user: root
        keyPath: ~/.ssh/id_rsa
      installFlags:
        - --labels=rohanrehmann.local/node-subnet=c-one
      noTaints: true
  k0s:
    version: v1.28.4+k0s.0
    dynamicConfig: true
    config:
      spec:
        network:
          nodeLocalLoadBalancing:
            enabled: true
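
Assuming you save this as k0sctl.yaml (any file name works), applying it looks like this:

$ k0sctl apply --config k0sctl.yaml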

That should bring up a rather empty k0s cluster using an HA control plane. Configure your local environment to use the cluster as its default, for example like this:

$ k0sctl kubeconfig > kubeconfig

$ export KUBECONFIG="$(pwd)/kubeconfig"

$ kubectl get pod -A
NAMESPACE        NAME                                  READY   STATUS    RESTARTS   AGE
kube-system      coredns-85df575cdb-f2xn6              1/1     Running   0          19m
kube-system      coredns-85df575cdb-lx7d9              1/1     Running   0          18m
kube-system      konnectivity-agent-cllpz              1/1     Running   0          18m
kube-system      konnectivity-agent-fs4dq              1/1     Running   0          18m
kube-system      konnectivity-agent-p2p8z              1/1     Running   0          18m
kube-system      kube-proxy-ccpdf                      1/1     Running   0          18m
kube-system      kube-proxy-vqcf4                      1/1     Running   0          19m
kube-system      kube-proxy-w8tfl                      1/1     Running   0          18m
kube-system      kube-router-9kkj5                     1/1     Running   0          18m
kube-system      kube-router-fnzxr                     1/1     Running   0          18m
kube-system      kube-router-w9blz                     1/1     Running   0          19m
kube-system      metrics-server-7556957bb7-g9tkx       1/1     Running   0          19m
kube-system      nllb-controller-0                     1/1     Running   0          17m
kube-system      nllb-controller-1                     1/1     Running   0          17m
kube-system      nllb-controller-2                     1/1     Running   0          17m

Now you can install the MetalLB and Traefik ingress controller Helm charts:

cat | kubectl -n kube-system patch clusterconfig k0s --type=merge --patch-file=/dev/stdin <<'EOF'
spec:
  extensions:
    helm:
      repositories:
        - name: metallb
          url: https://metallb.github.io/metallb
        - name: traefik
          url: https://traefik.github.io/charts
      charts:
        - name: metallb
          chartname: metallb/metallb
          version: 0.13.12
          namespace: metallb-system
        - name: traefik
          chartname: traefik/traefik
          version: 26.0.0
          namespace: traefik-system
          values: |
            deployment:
              replicas: 2

            podDisruptionBudget:
              enabled: true
              minAvailable: 1

            affinity:
              podAntiAffinity:
                requiredDuringSchedulingIgnoredDuringExecution:
                  - topologyKey: kubernetes.io/hostname
                    labelSelector:
                      matchLabels:
                        app.kubernetes.io/name: '{{ template "traefik.name" . }}'
                        app.kubernetes.io/instance: '{{ .Release.Name }}-{{ .Release.Namespace }}'

            providers:
              kubernetesIngress:
                publishedService:
                  enabled: true
EOF
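
k0s reconciles these Helm extensions in the background as Chart resources in the kube-system namespace. If you want to keep an eye on the progress, you can list them:

$ kubectl -n kube-system get charts.helm.k0sproject.io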

After that, your cluster should look similar to this:

$ kubectl get pod -A
NAMESPACE        NAME                                  READY   STATUS    RESTARTS   AGE
kube-system      coredns-85df575cdb-f2xn6              1/1     Running   0          23m
kube-system      coredns-85df575cdb-lx7d9              1/1     Running   0          23m
kube-system      konnectivity-agent-cllpz              1/1     Running   0          23m
kube-system      konnectivity-agent-fs4dq              1/1     Running   0          22m
kube-system      konnectivity-agent-p2p8z              1/1     Running   0          22m
kube-system      kube-proxy-ccpdf                      1/1     Running   0          23m
kube-system      kube-proxy-vqcf4                      1/1     Running   0          23m
kube-system      kube-proxy-w8tfl                      1/1     Running   0          22m
kube-system      kube-router-9kkj5                     1/1     Running   0          23m
kube-system      kube-router-fnzxr                     1/1     Running   0          22m
kube-system      kube-router-w9blz                     1/1     Running   0          23m
kube-system      metrics-server-7556957bb7-g9tkx       1/1     Running   0          23m
kube-system      nllb-controller-0                     1/1     Running   0          22m
kube-system      nllb-controller-1                     1/1     Running   0          21m
kube-system      nllb-controller-2                     1/1     Running   0          21m
metallb-system   metallb-controller-5f9bb77dcd-tfkfc   1/1     Running   0          18m
metallb-system   metallb-speaker-45mb9                 4/4     Running   0          18m
metallb-system   metallb-speaker-jpkwh                 4/4     Running   0          18m
metallb-system   metallb-speaker-xjkm5                 4/4     Running   0          18m
traefik-system   traefik-7945f7748f-6knlh              1/1     Running   0          16m
traefik-system   traefik-7945f7748f-rw6q7              1/1     Running   0          16m

Now you need to set up the MetalLB IP address ranges and enable Layer 2 advertisements. Here's an example:

cat | kubectl apply -f - <<'EOF'
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: c-hundred
  namespace: metallb-system
spec:
  addresses:
    - 192.168.100.0/24
  avoidBuggyIPs: true

---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: c-one
  namespace: metallb-system
spec:
  ipAddressPools:
    - c-hundred
  nodeSelectors:
    - matchLabels:
        rohanrehmann.local/node-subnet: c-one
EOF
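
You can verify that both resources were created:

$ kubectl -n metallb-system get ipaddresspools,l2advertisements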

By now, the ingress controller’s Service should have an external IP:

$ kubectl -n traefik-system get service
NAME      TYPE           CLUSTER-IP       EXTERNAL-IP     PORT(S)                      AGE
traefik   LoadBalancer   10.111.117.250   192.168.100.1   80:31178/TCP,443:30206/TCP   17m

Deploy some workload and expose it via ingress. Again, an example:

cat | kubectl -n default apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-example
  labels:
    app.kubernetes.io/name: nginx
    app.kubernetes.io/instance: nginx-example
    app.kubernetes.io/version: 1.25.3
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: nginx
      app.kubernetes.io/instance: nginx-example
  template:
    metadata:
      labels:
        app.kubernetes.io/name: nginx
        app.kubernetes.io/instance: nginx-example
        app.kubernetes.io/version: 1.25.3
    spec:
      containers:
        - name: nginx
          image: docker.io/library/nginx:1.25.3-alpine3.18-slim
          ports:
            - name: http
              containerPort: 80

---
apiVersion: v1
kind: Service
metadata:
  name: nginx-example
  labels:
    app.kubernetes.io/name: nginx
    app.kubernetes.io/instance: nginx-example
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/name: nginx
    app.kubernetes.io/instance: nginx-example
  ports:
    - name: http
      port: 80
      targetPort: http

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-example
  labels:
    app.kubernetes.io/name: nginx
    app.kubernetes.io/instance: nginx-example
spec:
  rules:
    - host: example.rohanrehmann.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx-example
                port:
                  name: http
EOF
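
You can wait for the rollout to complete before moving on:

$ kubectl -n default rollout status deployment/nginx-example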

Let’s inspect the Ingress resource:

$ kubectl -n default get ingress nginx-example -owide
NAME            CLASS     HOSTS                        ADDRESS         PORTS   AGE
nginx-example   traefik   example.rohanrehmann.local   192.168.100.1   80      12m

Given the above information, you should be able to reach nginx using the host name listed in the HOSTS column and the external IP listed in the ADDRESS column:

$ ssh root@192.168.1.110 curl -sS -H "'Host: example.rohanrehmann.local'" http://192.168.100.1/
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

If your local machine is on the same network as the Kubernetes cluster, then it
should work from your local machine as well:

$ curl -sS -H 'Host: example.rohanrehmann.local' http://192.168.100.1/
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

If that doesn’t work and you get an output like curl: (7) Failed to connect to 192.168.100.1 port 80 after 30 ms: Couldn't connect to server, then your local machine is not on the same network, and you need to configure the IP routing tables accordingly.
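
On a Linux machine, for example, an on-link route for the MetalLB pool via the interface that faces the cluster network should be enough (a sketch; replace eth0 with your actual interface name):

$ sudo ip route add 192.168.100.0/24 dev eth0

Since MetalLB's L2 mode answers ARP requests for the external IPs, an on-link route lets your machine resolve them directly on the local segment.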