5 Kubernetes Deployment Strategies: Roll Out Like the Pros

What is Kubernetes Deployment Strategy?

A Kubernetes Deployment allows you to declaratively create pods and ReplicaSets. You can define a desired state, and a Deployment Controller continuously monitors the current state of the relevant resources, and deploys pods to match the desired state. It plays a central role in Kubernetes autoscaling. 

A Deployment strategy defines how to create, upgrade, or downgrade different versions of Kubernetes applications. In a traditional software environment, deployments or upgrades to applications result in downtime and disruption of service. Kubernetes can help avoid this, providing several Deployment strategies that let you perform rolling updates to multiple application instances, and avoid or minimize downtime.

In this article, you will learn about the following Kubernetes Deployment strategies:

    1. Rolling deployment—replaces pods running the old version of the application with the new version, one by one, without downtime to the cluster.
    2. Recreate—terminates all the pods and replaces them with the new version.
    3. Ramped slow rollout—rolls out replicas of the new version, while in parallel, shutting down old replicas. 
    4. Best-effort controlled rollout—specifies a “max unavailable” parameter which indicates what percentage of existing pods can be unavailable during the upgrade, enabling the rollout to happen much more quickly.
    5. Canary deployment—uses a progressive delivery approach, with one version of the application serving most users, and another, newer version serving a small pool of test users. The test deployment is rolled out to more users if it is successful.


1. Rolling Deployment 

A rolling deployment is the default deployment strategy in Kubernetes. It replaces the existing version of pods with a new version, updating pods slowly one by one, without cluster downtime. 

The rolling update uses a readiness probe to check if a new pod is ready, before starting to scale down pods with the old version. If there is a problem, you can stop an update and roll it back, without stopping the entire cluster. 

To perform a rolling update, simply update the image of your pods using kubectl set image. This will automatically trigger a rolling update.

To refine your deployment strategy, change the parameters in the spec:strategy section of your manifest file. There are two optional parameters—maxSurge and maxUnavailable

  • MaxSurge specifies the maximum number of pods the Deployment is allowed to create at one time. You can specify this as a whole number (e.g. 5), or as a percentage of the total required number of pods (e.g. 10%, always rounded up to the next whole number). If you do not set MaxSurge, the implicit, default value is 25%.
  • MaxUnavailable specifies the maximum number of pods that are allowed to be unavailable during the rollout. Like MaxSurge, you can define it as an absolute number or a percentage. 

At least one of these parameters must be larger than zero. By changing the values of these parameters, you can define other deployment strategies, as shown below.


2. Recreate 

This is a basic deployment pattern which simply shuts down all the old pods and replaces them with new ones. You define it by setting the spec:strategy:type section of your manifest to Recreate, like this:

Kubernetes recreate new pods code snippet

The Recreate strategy can result in downtime, because old pods are deleted before ensuring that new pods are rolled out with the new version of the application.


3. Ramped Slow Rollout 

A ramped rollout updates pods gradually, by creating new replicas while removing old ones. You can choose the number of replicas to roll out each time. You also need to make sure that no pods become unavailable. 

The difference between this strategy and a regular rolling deployment is you can control the pace at which new replicas are rolled out. For example, you can define that only 1 or 2 nodes should be updated at any one time, to reduce the risk of an update.

To define this behavior, set maxSurge to 1 and maxUnvailable to 0. This means the Deployment will roll one pod at a time, while ensuring no pods are unavailable. So, for example, if there are ten pods, the Deployment will ensure at least ten pods are available at one time.

The following Deployment YAML file performs a ramped rollout. This configuration, and the one shown in the next section, was shared by Gaurav Agarwal.

Kubernetes ramped rollout YAML file snippet


4. Best-Effort Controlled Rollout 

The downside of a ramped rollout is that it takes time to roll out the application, especially at large scale. An alternative is a “best-effort controlled rollout”. This enables a faster rollout, but with a tradeoff of higher risk, by tolerating a certain percentage of downtime among your nodes.

This involves:

  • Setting maxUnavailable to a certain percentage, meaning your update can tolerate a certain amount of pods with downtime.
  • Setting maxSurge to 0, to ensure that there are always the same number of pods in the Deployment. This provides the best possible resource utilization during the update.

This has the effect of rapidly replacing pods, as quickly as possible, while ensuring a limited number of pods down at any given time. 

kubernetes best effort controlled rollout code snippet


5. Canary Deployment

Canary deployments are typically used to test some new features on the backend of an application. Two or more services or versions of an application are deployed in parallel, one running an existing version, and one with new features. Users are gradually shifted to the new version, allowing the new version to be validated by exposing it to real users. If no errors are reported, one of the new versions can be gradually deployed to all users.

See this detailed tutorial on the Kubernetes blog showing how to deploy an existing version and a new canary version, route between them using Gloo subset routing, and then shift all traffic to the new version.


Automating Kubernetes Infrastructure with Spot by NetApp

Spot Ocean from Spot by NetApp frees DevOps teams from the tedious management of their cluster’s worker nodes while helping reduce cost by up to 90%. Spot Ocean’s automated optimization delivers the following benefits:

  • Container-driven autoscaling for the fastest matching of pods with appropriate nodes
  • Easy management of workloads with different resource requirements in a single cluster
  • Intelligent bin-packing for highly utilized nodes and greater cost-efficiency
  • Cost allocation by namespaces, resources, annotation and labels
  • Reliable usage of the optimal blend of spot, reserved and on-demand compute pricing models
  • Automated infrastructure headroom ensuring high availability
  • Right-sizing based on actual pod resource consumption  

Learn more about Spot Ocean today!