Kubernetes StatefulSets: Scaling & Managing Persistent Apps

What Are Kubernetes StatefulSets?

A Kubernetes StatefulSet is a workload API resource object. In Kubernetes: 

  • An object: a persistent record of the desired state for part of your cluster, which Kubernetes works to make real
  • A workload: an application running on Kubernetes
  • A resource: an API endpoint that stores a collection of objects of a particular kind
  • A workload resource: an object that manages a set of pods (a pod is a group of one or more containers) on your behalf, using a controller that continuously maintains the desired state

There are several built-in Kubernetes workload resources, each designed for certain purposes. StatefulSets are designed to help you efficiently manage stateful applications.

You can use StatefulSets to define how a collection of pods is deployed and scaled, and to maintain the state of these pods.

StatefulSets require a pod template, which contains the specification for the pods the StatefulSet creates. The pod template determines the makeup of each pod, including the application that will run inside the containers, the volumes, the labels (which the StatefulSet's selector must match), and other configuration.

The configuration and any durable data of a StatefulSet's pods are kept in the persistent storage associated with the StatefulSet. StatefulSets suit workloads such as MySQL and Kafka, which need persistent storage together with stable hostnames and persistent identities.


StatefulSet Pod Components

Here are the main components of StatefulSets:

  • Pod name label: the StatefulSet controller adds a label to every pod it creates. The label key is statefulset.kubernetes.io/pod-name, and its value is the name of the pod. The label lets you attach a service to one specific pod in the StatefulSet.
  • Pod selector: the StatefulSet's .spec.selector field, which must match the labels set in the pod template (.spec.template.metadata.labels). In Kubernetes 1.8 and later you must specify a matching pod selector. Otherwise, creating the StatefulSet fails with a validation error.
  • Pod identity: composed of an ordinal, stable storage, and a stable network identity. The pod retains this identity regardless of the node it is scheduled or rescheduled to run on.
  • Ordinal index: an integer that fixes each pod's position in the set. In a StatefulSet with N replicas, each pod is assigned an ordinal from 0 up through N-1, and each ordinal is unique within the set.
  • Stable network ID: consists of the pod's hostname and a DNS subdomain provided by a headless service. The hostname of each pod is derived from the name of the StatefulSet plus the ordinal of the pod, in the form <statefulset name>-<ordinal>.
  • Headless service: not created by default. You need to create it yourself and reference it from the StatefulSet, because it is critical to the network identity of the pods.
  • Stable storage: achieved through the use of PersistentVolumes (PVs), which are storage resources. For each entry in volumeClaimTemplates, Kubernetes creates one PersistentVolumeClaim per pod, and each claim is bound to its own PV. You can define a StorageClass or use the default. Learn more in our guide to Kubernetes Persistent Volume.
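
To illustrate how a headless service and the pod ordinal combine into a stable network identity, here is a minimal sketch of a headless service manifest. The names nginx and web, the default namespace, and the cluster.local cluster domain are assumptions chosen for this example, not values taken from the article.

```yaml
# Hypothetical headless service for a StatefulSet named "web".
apiVersion: v1
kind: Service
metadata:
  name: nginx            # assumed name; the StatefulSet refers to it in .spec.serviceName
  labels:
    app: nginx
spec:
  clusterIP: None        # "None" is what makes the service headless
  selector:
    app: nginx           # must match the labels in the StatefulSet's pod template
  ports:
  - port: 80
    name: web
# With 3 replicas, the pods would be named web-0, web-1 and web-2.
# Assuming the "default" namespace and the "cluster.local" cluster domain,
# pod web-0 would be reachable at web-0.nginx.default.svc.cluster.local
```

Because the service has no cluster IP, DNS resolves to the individual pods rather than to a single load-balanced address, which is what gives each pod its own addressable name.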


Working with Kubernetes StatefulSets

How to Create a StatefulSet

You can create a StatefulSet using the kubectl apply command. This command accepts a manifest file and uses it to create or update cluster resources. For example:

kubectl apply -f manifest-file.yaml

Where manifest-file.yaml is the manifest.

The following is an example manifest file provided by Google Cloud. It shows how to create a StatefulSet governed by an existing service.

[Image: manifest file to create a StatefulSet in Google Cloud]

In this example, statefulset-name is an identifier for the StatefulSet, service-name is the name of the service that governs the StatefulSet, and app-name is the label value that identifies the application running on the StatefulSet's pods. container-name is the name of the container that runs in the pods. port-name is the name of the port that the container exposes, and pvc-name is the name of the PersistentVolumeClaim used to request persistent storage.
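
For readers who want something concrete to apply, the following is a minimal sketch of a manifest that uses the placeholder names described above. It is not the Google Cloud original: the container image, port number, mount path, replica count, and storage size are assumptions, and the placeholder names should be replaced with your own values.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: statefulset-name
spec:
  serviceName: service-name        # existing headless service that governs the set
  replicas: 3                      # assumed replica count
  selector:
    matchLabels:
      app: app-name                # must match the pod template labels below
  template:
    metadata:
      labels:
        app: app-name
    spec:
      containers:
      - name: container-name
        image: nginx:1.25          # assumed image, for illustration only
        ports:
        - containerPort: 80        # assumed port number
          name: port-name
        volumeMounts:
        - name: pvc-name           # refers to the volume claim template below
          mountPath: /usr/share/nginx/html   # assumed mount path
  volumeClaimTemplates:
  - metadata:
      name: pvc-name
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi             # assumed size
```

Because no storageClassName is set in the claim template, this sketch relies on the cluster having a default StorageClass.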

How to Update a Kubernetes StatefulSet

You can update a StatefulSet by modifying its manifest file and then running the same kubectl apply command you used to create it.

The update is rolled out according to the update strategy set in .spec.updateStrategy.type. If you do not set a value, Kubernetes uses RollingUpdate.

A StatefulSet supports two update strategies:

  • OnDelete: the controller does not replace pods automatically when you update the manifest. A pod is re-created from the new template only after you manually delete it.
  • RollingUpdate: pods are deleted and re-created one at a time in reverse ordinal order, from the highest ordinal to the lowest. You can also use .spec.updateStrategy.rollingUpdate.partition to create a phased rollout, in which only pods with an ordinal greater than or equal to the partition value are updated.
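
As a rough illustration of a phased rollout, the excerpt below shows only the update-related part of a StatefulSet spec. The replica count and partition value are assumptions picked for the example.

```yaml
# Excerpt of a StatefulSet spec; fields not related to updates are omitted.
spec:
  replicas: 3            # assumed replica count, giving ordinals 0, 1 and 2
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2       # only the pod with ordinal 2 receives the new template
```

Once the updated pod looks healthy, lowering the partition value (eventually to 0) and applying the manifest again lets the rollout continue to the remaining pods.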


Kubernetes StatefulSets Limitations

Here are the main limitations of Kubernetes StatefulSets:

  • StatefulSets are stable only in Kubernetes 1.9 and later: from version 1.5 through 1.8 they were a beta resource, and they are not available at all in releases prior to 1.5.
  • Persistent storage must be provisioned for each pod: either dynamically, by a PersistentVolume provisioner based on the requested StorageClass, or statically, by an administrator in advance. In both cases, pods request the storage through PersistentVolumeClaims (PVCs).
  • Volumes are not automatically deleted: when a StatefulSet is deleted or scaled down, the volumes associated with it are kept. This is done to ensure data safety.
  • You need to create a headless service: it is responsible for the network identity of your pods.
  • Deleting a StatefulSet gives no guarantees about how its pods are terminated: to make sure pods are terminated in order and gracefully, first scale the StatefulSet down to 0 and then delete it.
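
One way to carry out the scale-down step from the last item is declaratively, by changing the replica count in the manifest and applying it again. The excerpt below is a sketch of that single change and assumes the rest of the manifest is left as it was.

```yaml
# Excerpt of a StatefulSet spec; all other fields stay unchanged.
spec:
  replicas: 0   # with the default pod management policy, pods are
                # terminated in reverse ordinal order, highest first
```

After all pods have terminated, the StatefulSet itself can be deleted. The volumes remain in place, as noted above.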


StatefulSet vs Deployment

In Kubernetes, a Deployment is a workload resource object that lets you configure the lifecycle of pods in the cluster. Just like StatefulSets, Deployments let you define the desired state of the application, and the Deployment’s controller is responsible for maintaining this state. Deployments work through ReplicaSets, which automatically create interchangeable pods according to your specifications.

Unlike Deployments, which are intended for stateless applications, StatefulSets let you persist certain aspects of a pod across rescheduling, such as its storage and its stable network identity.

To summarize, the main difference is that Deployments are designed for stateless applications, while StatefulSets are designed for stateful applications.


Kubernetes Infrastructure Automation with Spot.io

Spot Ocean from Spot by NetApp frees DevOps teams from the tedious management of their cluster’s worker nodes while helping reduce cost by up to 90%. Spot Ocean’s automated optimization delivers the following benefits:

  • Container-driven autoscaling for the fastest matching of pods with appropriate nodes
  • Easy management of workloads with different resource requirements in a single cluster
  • Intelligent bin-packing for highly utilized nodes and greater cost-efficiency
  • Cost allocation by namespaces, resources, annotation and labels
  • Reliable usage of the optimal blend of spot, reserved and on-demand compute pricing models
  • Automated infrastructure headroom ensuring high availability
  • Right-sizing based on actual pod resource consumption  
