Kubernetes Affinity: The Basics and a Quick Tutorial

What Is Node Affinity? 

Node affinity in Kubernetes is a set of rules used to specify the criteria under which pods should be placed on particular nodes within a cluster. By defining node affinity rules, users can restrict which nodes a pod can be scheduled on based on labels on nodes. This can be important for workload-specific requirements such as ensuring pods only run on nodes with certain hardware.

The node affinity mechanism is vital for optimizing the resources and maintaining the health of applications. For example, it can help in scenarios where certain pods need to be in proximity to specific services or infrastructures, enhancing performance through reduced latency.

This is part of a series of articles about Kubernetes Architecture.

What Is Scheduling in Kubernetes? 

Scheduling in Kubernetes is the process of deciding which node a pod runs on. The Kubernetes scheduler determines the best available node for the pod based on several factors, such as resource availability, quality of service requirements, and policies like affinity and anti-affinity.

The scheduler works to ensure that, as far as possible, the allocation of pods across the nodes in the cluster meets all specified requirements. This decision-making process is highly pluggable, allowing various scheduling policies to be applied to alter the behavior of the cluster. 

The scheduler’s flexibility enables administrators to finely tune the scheduling algorithms according to the specific needs of their applications and the characteristics of their cluster’s hardware.

Kubernetes Affinity vs. Kubernetes Taints

Affinity and taints are both critical concepts for managing how pods are scheduled in a Kubernetes cluster. However, they play opposing roles: 

  • Affinity attracts pods to certain nodes, enhancing the scheduler’s ability to place pods based on preferred criteria. 
  • Taints repel pods from certain nodes unless the pod has a corresponding toleration, thereby preventing specific pods from being scheduled onto a node unless explicitly permitted.

While affinity rules make nodes attractive to certain pods, taints ensure that nodes reject certain pods unless overridden by tolerations. This complementarity allows more versatile management of pod placement across the cluster, enhancing both performance and resilience.
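
For illustration, here is a minimal sketch of how the two mechanisms look in practice; the taint key dedicated, its value gpu-workloads, and the node name are placeholders. A node is tainted with:

kubectl taint nodes <my-node> dedicated=gpu-workloads:NoSchedule

A pod that should be allowed onto that node then declares a matching toleration in its spec:

tolerations:
- key: dedicated
  operator: Equal
  value: gpu-workloads
  effect: NoSchedule

Note that the toleration only permits scheduling onto the tainted node; an affinity rule is still needed to actively attract the pod to it.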

Types of Kubernetes Affinity 

Affinity in Kubernetes can apply to nodes and to other pods, and the same mechanism also enables anti-affinity.

Node Affinity

Kubernetes node affinity rules limit pod placement to specific nodes. This can be declared as either required or preferred. Required rules must be met for a pod to be scheduled on a node, while preferred rules suggest guidelines that the scheduler attempts to enforce but does not guarantee. 

With node affinity, users can fine-tune their cluster’s utilization of resources, ensuring that pods are scheduled on the most appropriate nodes. For example, pods requiring large amounts of memory can be directed to nodes that have ample available memory, aligning resource needs with availability.

Inter-Pod Affinity

Inter-pod affinity in Kubernetes allows you to control the placement of pods relative to other pods. It can be used to ensure that certain sets of pods are kept together on the same node or spread across different nodes for high availability. This type of affinity is useful for maintaining high service levels and reducing latency between tightly coupled applications.

This affinity mechanism enhances overall application performance by strategically locating pods that need to frequently communicate closer together. Conversely, it can also be used to distribute pods across different nodes to prevent a single point of failure in a system.
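
As a minimal sketch, the snippet below co-locates a pod on the same node as pods carrying the hypothetical label app=cache; the topologyKey kubernetes.io/hostname restricts the rule to individual nodes:

affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - cache
      topologyKey: kubernetes.io/hostname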

Anti-Affinity

Kubernetes anti-affinity is used to explicitly separate pods from each other across multiple nodes. It is particularly useful in scenarios where you want to avoid running multiple instances of a specific pod on the same node. This may be necessary for high-availability applications, such as distributed databases that require replication across several nodes to avoid a single point of failure.

Anti-affinity rules help in distributing the load and in enhancing the resilience of the system by preventing too many similar pods from being concentrated in a single node. This distribution aids in load balancing and reduces risks associated with hardware failures, maintaining service continuity.
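
For example, the following sketch keeps pods carrying the hypothetical label app=web off nodes that already run a pod with the same label:

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - web
      topologyKey: kubernetes.io/hostname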

Tutorial: Assign Pods to Nodes Using Node Affinity

Here’s an overview of how to leverage Kubernetes node affinity to assign pods to a desired node. The code in this section was adapted from the Kubernetes documentation.

Prerequisites

Before you start assigning pods to specific nodes using node affinity in Kubernetes, ensure that you have a functional Kubernetes cluster. Your cluster should have at least two nodes that do not serve as control plane hosts. Make sure your Kubernetes server version is at least v1.10. You can check your server version by running kubectl version in your command line.

Add a Label to a Node

To utilize node affinity effectively, you need to label the nodes in the cluster where you intend to schedule your pods. 

For example, if you want to assign a pod to a node with a GPU, label your chosen node with hardware=gpu by running:

kubectl label nodes <my-node> hardware=gpu

To confirm that the label has been added successfully, run the kubectl get nodes --show-labels command and look for the hardware=gpu label in the output.
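
You can also list only the nodes that carry the new label as a quick check (assuming the label was applied as above):

kubectl get nodes -l hardware=gpu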

Learn more in our detailed guide to Kubernetes labels 

Schedule a Pod with Required Node Affinity

To schedule a pod on a node that must meet certain criteria, use required node affinity. This ensures the pod will only be placed on a node with the specified label. To schedule a pod on the node you labeled earlier, use a pod manifest that specifies node affinity for hardware=gpu. Here’s how you define this in your pod specification:

apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: hardware
            operator: In
            values:
            - gpu
  containers:
  - name: busybox
    image: busybox
    imagePullPolicy: IfNotPresent

Save this configuration as busybox-required-affinity.yaml and apply it using:

kubectl apply -f busybox-required-affinity.yaml

Verify that the pod has been scheduled on the intended node by running:

kubectl get pods --output=wide

This command will show you the pod’s node placement along with other details such as IP and status.
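
If you only need the node name, you can also query it directly with a JSONPath expression (an optional convenience, not required by the tutorial):

kubectl get pod busybox -o jsonpath='{.spec.nodeName}'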

Schedule a Pod with Preferred Node Affinity

If you want a pod to preferentially schedule on a node but not exclusively, use preferred node affinity. This method suggests, but does not strictly enforce, pod placement. Here’s an example of a pod manifest with preferred node affinity for hardware=gpu, named busybox-preferred so it does not clash with the pod created in the previous step:

apiVersion: v1
kind: Pod
metadata:
  name: busybox-preferred
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: hardware
            operator: In
            values:
            - gpu
  containers:
  - name: busybox
    image: busybox
    imagePullPolicy: IfNotPresent

Save this configuration as busybox-preferred-affinity.yaml and apply it using:

kubectl apply -f busybox-preferred-affinity.yaml

Check the pod’s placement with:

kubectl get pods --output=wide

This will show whether the pod was scheduled on a preferred node, taking into consideration other factors such as cluster load and resource availability.

Related content: Read our guide to Kubernetes pod

Kubernetes Node Affinity Best Practices 

Here are some measures to help ensure the most effective use of node affinity in Kubernetes.

Use Node Affinity Sparingly

While node affinity can optimize resource use and control where pods are placed, overly restrictive affinity rules can lead to pod scheduling failures or limited scalability. It’s important to balance the affinity rules with the cluster’s capacity and the flexibility needs of other applications. Applying node affinity rules judiciously helps ensure that the overall health and performance of the cluster are not compromised.

Consider the Cluster Topology

Understanding the physical and logical topology of your Kubernetes cluster is crucial when using node affinity. Affinity rules are based on node labels that reflect underlying properties such as region, availability zone, or custom labels describing the topology. This awareness can prevent common issues such as unintentionally scheduling multiple critical pods in a single zone, where they could all be affected by a zone-specific failure.

Accurate topology knowledge enables the creation of affinity rules that improve resilience and efficiency, for example by spreading load across different physical locations to protect against localized failures.
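
As a sketch, a preferred (soft) pod anti-affinity rule keyed on the well-known zone label spreads replicas of a hypothetical app=critical-service workload across availability zones:

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - critical-service
        topologyKey: topology.kubernetes.io/zone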

Combine Node Affinity with Other Scheduling Policies

Additional scheduling policies like pod anti-affinity and taints/tolerations can complement affinity rules, improving your ability to fine-tune pod placement. This synergy between different policies allows for more precise control over where pods are scheduled based on a wider array of criteria and conditions.
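
As an illustration of combining policies, a single pod spec can require a GPU node via node affinity and tolerate a hypothetical dedicated=gpu-workloads taint at the same time:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: hardware
            operator: In
            values:
            - gpu
  tolerations:
  - key: dedicated
    operator: Equal
    value: gpu-workloads
    effect: NoSchedule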

Prioritize Preferred Affinity 

Whenever feasible, use preferred affinity (preferredDuringSchedulingIgnoredDuringExecution) rather than required affinity (requiredDuringSchedulingIgnoredDuringExecution). Also known as soft affinity, this approach provides guidelines to the scheduler but does not rigidly enforce them, offering more flexibility. It reduces the risk of unschedulable pods, enhancing scheduling success rates.
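
When several preferred terms are defined, each carries a weight from 1 to 100 and the scheduler favors nodes with the highest total score. Here is a sketch with hypothetical hardware and disk labels:

preferredDuringSchedulingIgnoredDuringExecution:
- weight: 80
  preference:
    matchExpressions:
    - key: hardware
      operator: In
      values:
      - gpu
- weight: 20
  preference:
    matchExpressions:
    - key: disk
      operator: In
      values:
      - ssd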

Automating Kubernetes Infrastructure with Spot by NetApp

Spot Ocean from Spot by NetApp frees DevOps teams from the tedious management of their cluster’s worker nodes while helping reduce cost by up to 90%. Spot Ocean’s automated optimization delivers the following benefits:

  • Container-driven autoscaling for the fastest matching of pods with appropriate nodes
  • Easy management of workloads with different resource requirements in a single cluster
  • Intelligent bin-packing for highly utilized nodes and greater cost-efficiency
  • Cost allocation by namespaces, resources, annotations and labels
  • Reliable usage of the optimal blend of spot, reserved and on-demand compute pricing models
  • Automated infrastructure headroom ensuring high availability
  • Right-sizing based on actual pod resource consumption  

Learn more about Spot Ocean today!