How to do maintenance work in a Kubernetes cluster



  • We just moved to Kubernetes, but the engineer who helped launch it went on paternity leave earlier than we had hoped (never trust a baby not to be eager!).

    Now we're trying to do maintenance tasks and one-off work, and nodes keep getting killed in the middle of things.

    I've looked at using Kubernetes Jobs for this, but that's overkill. We don't want to write manifest files for everything.

    We just need long lived shell access to do this and that.

    What's the pattern for this so your maintenance task doesn't get killed?



  • We were able to fix this (https://github.com/freelawproject/courtlistener/issues/2079#issuecomment-1145998549) by following the rules for when a node is terminated. According to the cluster autoscaler FAQ (https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node), there are a number of ways that pods can prevent the cluster autoscaler from removing a node. One type of pod is:

    Pods that are not backed by a controller object (so not created by deployment, replica set, job, stateful set etc).

    So our solution is to create exactly that kind of pod via a manifest file. This gives us a pod named maintenance that sticks around and isn't killed by the cluster autoscaler:

    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: maintenance
      namespace: blah
      labels:
        type: maintenance
    spec:
      containers:
        - name: web
          image: whatever
          imagePullPolicy: IfNotPresent
          # Run an interactive shell; stdin and tty keep bash alive so the
          # container doesn't exit immediately after starting.
          command: [bash]
          stdin: true
          tty: true
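To use the pod, apply the manifest and exec into it, then delete it when you're done so the idle pod doesn't block scale-down indefinitely. A sketch of the workflow (the manifest filename here is hypothetical; the pod name and namespace match the manifest above):

    ```shell
    # Create the standalone pod from the manifest
    kubectl apply -f maintenance-pod.yaml

    # Get a long-lived interactive shell inside it
    kubectl exec -it maintenance --namespace blah -- bash

    # Clean up when finished; an unmanaged pod will otherwise keep
    # preventing the autoscaler from removing its node
    kubectl delete pod maintenance --namespace blah
    ```

The same FAQ also describes annotating a pod with "cluster-autoscaler.kubernetes.io/safe-to-evict": "false", which is another way to keep the autoscaler from removing the node a pod runs on.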
    

