Getting an X11 client in a docker container to work with x2go

Getting an X11 client that runs in a docker container to display properly in an x2go session is not as straightforward as with normal X11 clients in docker. To make it work, it is important to set the option --network=host.

Like this:

docker run -e HOME=$HOME -e DISPLAY=$DISPLAY --network=host -it \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    -v $HOME/.Xauthority:$HOME/.Xauthority \
    --user $(id -u):$(id -g) "image-name"
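
Before starting the container, it can help to verify the two things the command above depends on: a DISPLAY value and an .Xauthority file. A minimal sketch (the helper name check_x11_env is made up):

```shell
# check_x11_env DISPLAY_VALUE XAUTHORITY_FILE
# Returns 0 when the DISPLAY value is non-empty and the Xauthority
# file exists; both must be in place for the containerized client
# to reach the x2go X server.
check_x11_env() {
  [ -n "$1" ] && [ -f "$2" ]
}

# Example: check the current session before running docker.
if check_x11_env "$DISPLAY" "$HOME/.Xauthority"; then
  echo "X11 environment looks ok"
else
  echo "DISPLAY unset or $HOME/.Xauthority missing" >&2
fi
```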

Gluster cluster split in two?

It can happen that a Gluster cluster gets divided in two parts. I'm not talking about a volume split-brain here but about a whole cluster. Something might have gone wrong when probing a node. Or, as in our case, when adding aliases for nodes the peer info file was corrupted (there seems to be a maximum name length for nodes), which caused some nodes to believe they were in another cluster.

The solution is to first decide which nodes you consider to be the proper cluster. Running gluster peer status will show you which other nodes are considered to be in the same group as the node you run the status command on. Nodes in state "Peer Rejected" might think they are part of another cluster. If most of the nodes are in "Peer Rejected" state, you should probably run the command on one of those rejected nodes instead; you will likely see that most nodes there are in an ok state.

On all the nodes in rejected state, run the following procedure:

  1. Stop glusterd
  2. Remove all files from /var/lib/glusterd except the
  3. Start glusterd again
  4. Run a gluster peer probe to a member node.
  5. Restart glusterd again
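
The steps above can be sketched as a small script. This is a sketch, not a drop-in tool: the helper name rejoin_cluster is made up, and it assumes the file to preserve in /var/lib/glusterd is glusterd.info (the node's own identity file), which the original step 2 leaves unnamed - double-check that on your system before deleting anything.

```shell
# rejoin_cluster PEER - PEER is any node in the part of the cluster
# you decided is the proper one.
rejoin_cluster() {
  peer="$1"
  systemctl stop glusterd                                        # step 1
  # Step 2: keep only the node's own identity file; peer and volume
  # data are rebuilt from the cluster after the probe.
  find /var/lib/glusterd -mindepth 1 ! -name 'glusterd.info' -delete
  systemctl start glusterd                                       # step 3
  gluster peer probe "$peer"                                     # step 4
  systemctl restart glusterd                                     # step 5
}
# Usage: rejoin_cluster good-node.example.com
```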

Other lessons learned:

Do make sure that you save the file; otherwise a new one will be created and you will effectively be creating a new node with the same name. To solve this, stop the glusterd daemon on all nodes, remove the faulty uuid from /var/lib/glusterd/peers and restart glusterd on all nodes again.
I did not find this error immediately, and I was struggling with a lot of locking errors in the glusterd.log file; any "gluster volume status" command would just hang forever.

kubeadm upgrade node failing with "failed to get config map: Unauthorized"

Are you getting the errors below when running kubeadm upgrade node?

[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
unable to fetch the kubeadm-config ConfigMap: failed to get config map: Unauthorized
To see the stack trace of this error execute with --v=5 or higher

Chances are that the kubelet certificate is expired and cannot be used to upgrade the node. An strace of that command reveals it does not actually use the current user's config but the file /etc/kubernetes/kubelet.conf. That file in turn points to the key and cert to be used. In my installation both cert and key referred to /var/lib/kubelet/pki/kubelet-client-current.pem as follows:

client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem

That file is in turn a symlink to the real file, which in my case was /var/lib/kubelet/pki/kubelet-client-2020-06-04-18-38-46.pem - and which for some reason had not been updated recently.
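
A quick, read-only way to confirm the expiry is to ask openssl. The helper name cert_expired below is made up, but openssl x509 -checkend is a standard flag:

```shell
# cert_expired FILE - prints "expired" if the certificate in FILE has
# passed its notAfter date (or the file cannot be read), "valid" otherwise.
cert_expired() {
  if openssl x509 -in "$1" -noout -checkend 0 >/dev/null 2>&1; then
    echo valid
  else
    echo expired
  fi
}

# Example: cert_expired /var/lib/kubelet/pki/kubelet-client-current.pem
```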

You need to create an updated cert/key file there and point /var/lib/kubelet/pki/kubelet-client-current.pem at the new file instead.

Problem solved.

Did your kubelet certificate expire in k8s?

For some reason the kubelet's self-signed certificate had expired in my cluster. That is the certificate for the kubelet's own API service, running on port 10250 (i.e. not the client cert that the kubelet uses to talk to the api-servers). It is supposed to be a self-signed certificate, but it was not renewed.

The problem was not very obvious, but we saw it when the metrics service did not work properly. It complained about expired certificates on port 10250 on the nodes.

I could not find any article about how to re-create this certificate. Sure, kubeadm certs has a lot of renewal options, but none for the actual kubelet HTTPS port as far as I could find out.

The solution turned out to be quite simple. Just remove the two files /var/lib/kubelet/pki/kubelet.crt and /var/lib/kubelet/pki/kubelet.key and restart the kubelet service with systemctl restart kubelet.

The kubelet will then generate new self-signed certs.
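
As a script, the fix looks like this (a sketch assuming systemd manages the kubelet; the helper name is made up):

```shell
# renew_kubelet_serving_cert - remove the expired self-signed pair and
# let the kubelet recreate it on startup.
renew_kubelet_serving_cert() {
  rm -f /var/lib/kubelet/pki/kubelet.crt /var/lib/kubelet/pki/kubelet.key
  # The kubelet regenerates kubelet.crt/kubelet.key when it starts
  # and finds them missing.
  systemctl restart kubelet
}
# Usage (as root, on the affected node): renew_kubelet_serving_cert
```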

In the end, though, this turned out not to be the problem. First, the metrics-server deployment needs to run with the container argument --kubelet-insecure-tls, at least if the kubelets run with self-signed certs.
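
For reference, the flag goes into the metrics-server container's args. A minimal fragment of the Deployment spec (assuming the standard upstream metrics-server manifest layout):

```yaml
# metrics-server Deployment fragment; the flag makes metrics-server
# skip verification of the kubelets' serving certificates.
spec:
  template:
    spec:
      containers:
      - name: metrics-server
        args:
        - --kubelet-insecure-tls
```

Alternatively, run kubectl -n kube-system edit deployment metrics-server and append the flag to the existing args list.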

Our root problem was that one api-server was running with faulty proxy settings, which caused its internal calls to the metrics server to fail.