Kubernetes Autohealing Support

Aviv Shukron

Product Manager

January 9, 2018

Today we are announcing the Kubernetes auto-healing feature, which helps ensure that your running Kubernetes worker nodes are healthy and ready to serve your pods and applications.

Previously, you could configure auto-healing for your Elastigroup cluster by choosing from Spotinst’s variety of health check options, such as EC2, Load Balancer, and HTTP/S endpoint checks. However, these are not ideal for Kubernetes clusters. For example, an instance may fail to join the Kubernetes cluster, or enter a failed state due to networking issues that prevent some pods from operating normally. In those cases the instance itself can still look healthy to these checks.

The status of each Kubernetes node is represented as a set of “condition” objects, each describing a different aspect of the node. The condition types are OutOfDisk, Ready, MemoryPressure, DiskPressure, and NetworkUnavailable. Each condition type has a status of True, False, or Unknown.
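To make the condition object concrete, here is a minimal Python sketch of how a node's conditions might look once read from the Kubernetes API. The `type` and `status` field names follow the Kubernetes node status format. The sample values and the `conditions_by_type` helper are illustrative assumptions, not part of Elastigroup.

```python
# Illustrative sample of a node's "conditions" list (the values are made up).
SAMPLE_CONDITIONS = [
    {"type": "OutOfDisk", "status": "False"},
    {"type": "MemoryPressure", "status": "False"},
    {"type": "DiskPressure", "status": "False"},
    {"type": "NetworkUnavailable", "status": "False"},
    {"type": "Ready", "status": "True"},
]

# Kubernetes reports condition statuses as strings, not booleans.
VALID_STATUSES = {"True", "False", "Unknown"}


def conditions_by_type(conditions):
    """Hypothetical helper: map each condition type to its status string."""
    result = {}
    for condition in conditions:
        status = condition["status"]
        if status not in VALID_STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        result[condition["type"]] = status
    return result


if __name__ == "__main__":
    print(conditions_by_type(SAMPLE_CONDITIONS)["Ready"])
```

Note that for most condition types a status of False is the healthy value, while for Ready the healthy value is True.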

Now you can use the new Kubernetes auto-healing health check type, “K8S_NODE”. Elastigroup monitors each node’s status every 30 seconds. If the Ready condition is False or Unknown, Elastigroup considers the instance unhealthy and triggers a replacement, helping keep your cluster efficient and performant.
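The decision rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Elastigroup's actual implementation: the function name `is_node_unhealthy` is hypothetical, and treating a node that reports no Ready condition at all as Unknown is an assumption made for this sketch.

```python
def is_node_unhealthy(conditions):
    """Return True when the node's Ready condition is False or Unknown.

    `conditions` is a list of dicts with "type" and "status" keys, as found
    in a Kubernetes node's status. A missing Ready condition is treated as
    Unknown (an assumption of this sketch).
    """
    ready_status = "Unknown"
    for condition in conditions:
        if condition.get("type") == "Ready":
            ready_status = condition.get("status", "Unknown")
            break
    return ready_status in ("False", "Unknown")


if __name__ == "__main__":
    healthy = [{"type": "Ready", "status": "True"}]
    not_ready = [{"type": "Ready", "status": "False"}]
    print(is_node_unhealthy(healthy))
    print(is_node_unhealthy(not_ready))
```

Only the Ready condition is consulted here, which matches the behavior described above. Other conditions, such as MemoryPressure, do not affect the result in this sketch.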

Configuration Example:

{
  .....
  "launchSpecification":{
    .....
    "healthCheckType": "K8S_NODE",
    "healthCheckGracePeriod": 120,
    "healthCheckUnhealthyDurationBeforeReplacement": 300
  }
}