How to do maintenance work in a Kubernetes cluster
-
We just moved to Kubernetes, but the engineer who helped launch it went on paternity leave earlier than we had hoped (never trust a baby not to be eager!).
Now we're trying to do maintenance tasks and one-off work, and nodes keep getting killed in the middle of things.
I've looked at using Kubernetes Jobs for this, but that's overkill; we don't want to write manifest files for everything.
We just need long-lived shell access to do this and that.
What's the pattern for this so your maintenance task doesn't get killed?
-
We solved this (https://github.com/freelawproject/courtlistener/issues/2079#issuecomment-1145998549) by following the rules for when a node is terminated. According to the cluster autoscaler FAQ (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node), there are a number of ways that pods can prevent the cluster autoscaler from removing a node. One type of pod is:
Pods that are not backed by a controller object (so not created by deployment, replica set, job, stateful set etc).
So our solution is to create exactly that kind of pod via a manifest file. This lets us have a pod named maintenance that sticks around and isn't killed by the cluster autoscaler:

---
apiVersion: v1
kind: Pod
metadata:
  name: maintenance
  namespace: blah
  labels:
    type: maintenance
spec:
  containers:
    - name: web
      image: whatever
      imagePullPolicy: IfNotPresent
      command: [bash]
      stdin: true
      tty: true
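To work with it, apply the manifest and then exec into the pod for an interactive shell. A minimal sketch, assuming the manifest above is saved as maintenance.yaml (the file name is an arbitrary choice) and the blah namespace already exists:

# Create the standalone pod; no controller backs it, so the autoscaler won't remove its node
kubectl apply -f maintenance.yaml

# Open an interactive shell; the container is already running bash with stdin/tty enabled
kubectl exec -it maintenance -n blah -- bash

# When the maintenance work is done, delete the pod so the autoscaler can scale the node down again
kubectl delete pod maintenance -n blah

The last step matters: as long as the bare pod exists, the cluster autoscaler will keep its node around, so cleaning up is what lets the cluster shrink back to normal.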