Following k0sctl instructions verbatim results in hung script

Hi everyone,

I'm new to k0s. I generated SSH keys, copied them over, verified they worked, enabled passwordless sudo for the user, then copied and pasted the script from the Quickstart guide (changing only the usernames, IP addresses, and SSH port).

Here’s the output:

⣿⣿⡇⠀⠀⢀⣴⣾⣿⠟⠁⢸⣿⣿⣿⣿⣿⣿⣿⡿⠛⠁⠀⢸⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀█████████ █████████ ███
⠀⣿⣿⡇⣠⣶⣿⡿⠋⠀⠀⠀⢸⣿⡇⠀⠀⠀⣠⠀⠀⢀⣠⡆⢸⣿⣿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀███ ███ ███
⠀⣿⣿⣿⣿⣟⠋⠀⠀⠀⠀⠀⢸⣿⡇⠀⢰⣾⣿⠀⠀⣿⣿⡇⢸⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⠀███ ███ ███
⠀⣿⣿⡏⠻⣿⣷⣤⡀⠀⠀⠀⠸⠛⠁⠀⠸⠋⠁⠀⠀⣿⣿⡇⠈⠉⠉⠉⠉⠉⠉⠉⠉⢹⣿⣿⠀███ ███ ███
⠀⣿⣿⡇⠀⠀⠙⢿⣿⣦⣀⠀⠀⠀⣠⣶⣶⣶⣶⣶⣶⣿⣿⡇⢰⣶⣶⣶⣶⣶⣶⣶⣶⣾⣿⣿⠀█████████ ███ ██████████
k0sctl v0.15.5 Copyright 2023, k0sctl authors.
Anonymized telemetry of usage will be sent to the authors.
By continuing to use k0sctl you agree to these terms:

INFO ==> Running phase: Connect to hosts
INFO [ssh] 154.53.59.199:53572: connected
INFO [ssh] 66.94.112.149:53572: connected
INFO ==> Running phase: Detect host operating systems
INFO [ssh] 66.94.112.149:53572: is running Ubuntu 22.04.3 LTS
INFO [ssh] 154.53.59.199:53572: is running Ubuntu 22.04.3 LTS
INFO ==> Running phase: Acquire exclusive host lock
INFO ==> Running phase: Prepare hosts
INFO ==> Running phase: Gather host facts
INFO [ssh] 154.53.59.199:53572: using node1 as hostname
INFO [ssh] 66.94.112.149:53572: using node2 as hostname
INFO [ssh] 154.53.59.199:53572: discovered eth0 as private interface
INFO [ssh] 66.94.112.149:53572: discovered eth0 as private interface
INFO ==> Running phase: Validate hosts
INFO ==> Running phase: Gather k0s facts
INFO ==> Running phase: Validate facts
INFO ==> Running phase: Configure k0s
WARN [ssh] 154.53.59.199:53572: generating default configuration
INFO [ssh] 154.53.59.199:53572: validating configuration
INFO [ssh] 154.53.59.199:53572: configuration was changed
INFO ==> Running phase: Initialize the k0s cluster
INFO [ssh] 154.53.59.199:53572: installing k0s controller
INFO [ssh] 154.53.59.199:53572: waiting for the k0s service to start
INFO [ssh] 154.53.59.199:53572: waiting for kubernetes api to respond
INFO ==> Running phase: Install workers
INFO [ssh] 66.94.112.149:53572: validating api connection to https://154.53.59.199:6443

And then it just gets stuck…

What’s more, the same thing happens if I add 10 nodes. It always gets stuck on one node (and it’s never the same node IP). Logging in to the stuck node reveals nothing interesting: top/ps aux show no stuck processes, and systemctl status on the controller shows the k0s service is active.

I am not doing anything fancy here. Just a basic setup.

Can anyone provide any guidance? I really want to avoid having to set up a full-fledged k8s cluster via kubeadm. The motivation behind k0s is interesting, but it’s not proving to be “FRICTIONLESS”…

Thanks.

And in case someone asks, here’s my k0sctl.yaml file.

apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-cluster
spec:
  hosts:
  - ssh:
      address: 154.53.59.199
      user: ansible_user
      port: 53572
      keyPath: /home/ansible_user/.ssh/id_rsa
    role: controller
  - ssh:
      address: 66.94.112.149
      user: ansible_user
      port: 53572
      keyPath: /home/ansible_user/.ssh/id_rsa
    role: worker
  k0s:
    version: 1.27.5+k0s.0
    dynamicConfig: false

Basic stuff…

Hi!

According to the logs, k0sctl tries to validate the connection to the kube-apiserver on port 6443. The connection gets stuck, which usually means there is a firewall blocking it. Please check that port 6443 is open in your firewall.
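Since you're on Ubuntu 22.04, a sketch of the firewall rules with ufw might look like the following. This assumes ufw is what's in front of the nodes (your VPS provider may also have its own firewall layer), and the port list is taken from the k0s networking docs at the time of writing, so verify it against the current docs:

```shell
# On the controller: ports other nodes (and k0sctl) must reach.
sudo ufw allow 6443/tcp    # kube-apiserver (workers and kubectl connect here)
sudo ufw allow 8132/tcp    # konnectivity server (workers -> controller)
sudo ufw allow 9443/tcp    # k0s controller join API
sudo ufw allow 2380/tcp    # etcd peers (only needed between controllers in HA)

# On every node:
sudo ufw allow 10250/tcp   # kubelet API
sudo ufw allow 179/tcp     # BGP, used by the default kube-router CNI
sudo ufw allow 4789/udp    # VXLAN, needed if you switch to Calico with VXLAN
```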

Check these docs to get more information about networking in k0s


Hi,
Besides what @makhov said (which is the root cause), the script is almost certainly not stuck; it just has an excessive timeout. After 15 minutes it will continue. I will make a PR to reduce this to a more reasonable number.


Thank you :pray: Alexey @makhov & Juan-Luis @juanluisvaladas. The resolution was simple enough.

I see you’re moderators. Maintainers as well?

Re: the PR: @juanluisvaladas, perhaps also a more informative message while waiting (for example: retry attempts, “no response yet”, which checks are pending, etc.)?

Next steps for me
My goal is to build as thin a 100% Kubernetes cluster as possible, but include Istio, monitoring (Prometheus, Jaeger, Fluentd, OTel), plus Calico (I need network security policies) and Rook. I’ve torn down the k0s cluster and started a more complex setup (HA cluster, Calico CNI, etc.). Other things are breaking, and the causes aren’t as obvious given the thin “pre-flight” checks.
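For the HA + Calico attempt, I'm working from a k0sctl.yaml along these lines (addresses are placeholders; the spec.k0s.config block is passed through as the k0s configuration per the k0sctl docs, so field names are worth double-checking against the current schema):

```yaml
apiVersion: k0sctl.k0sproject.io/v1beta1
kind: Cluster
metadata:
  name: k0s-ha-cluster
spec:
  hosts:
  - ssh:
      address: 10.0.0.1        # placeholder; HA wants an odd number of
      user: ansible_user       # controllers (e.g. 3) for etcd quorum
      port: 53572
      keyPath: /home/ansible_user/.ssh/id_rsa
    role: controller
  - ssh:
      address: 10.0.0.2        # placeholder; controller 2
      user: ansible_user
      port: 53572
      keyPath: /home/ansible_user/.ssh/id_rsa
    role: controller
  - ssh:
      address: 10.0.0.3        # placeholder; controller 3
      user: ansible_user
      port: 53572
      keyPath: /home/ansible_user/.ssh/id_rsa
    role: controller
  - ssh:
      address: 10.0.0.10       # placeholder; worker
      user: ansible_user
      port: 53572
      keyPath: /home/ansible_user/.ssh/id_rsa
    role: worker
  k0s:
    version: 1.27.5+k0s.0
    config:
      spec:
        network:
          provider: calico     # replaces the default kube-router CNI
```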

Any advice? I will post step by step instructions as I get through it all.

Re: pre-flight checks, some suggestive feedback:

  1. At a minimum: could the pre-flight check instructions be a little more imperative? For example, the current version of the pre-flight check in the networking page explicitly states: “When deploying k0s with the default settings, all pods on a node can communicate with all pods on all nodes. No configuration changes are needed to get started.” That is misleading, no? The table at the bottom doesn’t make an explicit request to open the listed ports. I’ve used Kubespray/Ansible, and the playbook Kubespray uses does open the ports.

  2. Better (and harder): include a check in the k0sctl sequence for the default firewall and the required ports, and provide feedback when they are not open. I realize option 2 turns this into a layered, Ansible-like playbook and gets hairy over time (it’s not just networking but a full-fledged roadmap); I’m sure many have thought of this before, and #2 is complex (it differs by OS, installed firewall, etc.).

So maybe just #1 would help. Perhaps I’m missing something.
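For what it's worth, even a bare TCP probe of the required ports before running k0sctl would have caught my issue. A minimal sketch (hypothetical helper, not part of k0sctl; uses bash's /dev/tcp, so it needs bash rather than plain sh, and the port list should be checked against the k0s networking docs):

```shell
#!/usr/bin/env bash
# probe_port HOST PORT -> exit 0 if the TCP port accepts connections
# within 3 seconds, nonzero otherwise.
probe_port() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Replace with your controller's address before use.
CONTROLLER="${CONTROLLER:-127.0.0.1}"

# Ports a worker needs to reach on the controller (per k0s networking docs).
for port in 6443 8132 9443 10250; do
  if probe_port "$CONTROLLER" "$port"; then
    echo "port $port: open"
  else
    echo "port $port: CLOSED or filtered"
  fi
done
```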

Again thank you for your time. Cheers!