Kubernetes Labels: A Practical Guide

What Are Kubernetes Labels? 

Kubernetes labels are key-value pairs attached to Kubernetes objects such as pods, nodes, and services. They identify attributes of objects that are meaningful and relevant to users, but they do not directly imply semantics to the core system. Labels are used to organize objects and to select subsets of them.

Suppose you have multiple pods running in different environments, such as development, testing, and production. You can assign a label to each pod to identify the environment it belongs to, for example environment=development. You can then easily filter pods by environment.

Labels can be attached to objects at the time of creation or later on, and each object can have several key-value pairs. Label selectors are the primary grouping primitive in Kubernetes and are utilized by users to select a set of objects. The Kubernetes API currently supports two types of selectors: equality-based and set-based.

This is part of a series of articles about Kubernetes architecture.

When Should You Use Kubernetes Labels? 

Group Resources for Object Queries

One of the most common uses for Kubernetes labels is to group resources for object queries. Suppose you have hundreds of pods running in your Kubernetes cluster and you want to identify which pods belong to a particular application or service. You can tag these pods with labels that represent their associated application or service, and then query for the pods by those labels.

This use of labels can be particularly useful in a microservices architecture, where you may have many different services running in your cluster. By assigning appropriate labels to each service, you can streamline the process of querying for and managing these services.

Perform Bulk Operations

Another common use case for Kubernetes Labels is to perform bulk operations on a group of Kubernetes objects. Suppose you want to update a configuration setting for all pods running a specific version of your application. By labeling these pods with a specific version, you can easily identify and update these pods in a single operation.

This saves you from having to identify and update each pod individually, and it reduces the risk of errors that come with performing these operations by hand. Similarly, you can use labels to delete or scale a group of Kubernetes objects that share a characteristic.

Schedule Pods Based on Node Labels

Kubernetes labels can also influence how pods are scheduled onto nodes in your cluster. By assigning labels to your nodes, you can create rules that ensure certain pods are scheduled onto certain nodes. For example, you might give nodes with a large amount of memory a label such as node-type=high-memory (an illustrative key and value), and then use a nodeSelector or a node affinity rule so that your memory-intensive pods are scheduled onto these nodes.
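
As a minimal sketch of this idea, the manifest below assumes that at least one node already carries the label node-type=high-memory. The label key and value, the pod name, and the image are illustrative assumptions, not built-in Kubernetes names. The pod requests matching nodes through spec.nodeSelector:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-intensive-app        # hypothetical pod name
spec:
  # Only nodes that carry every label listed here are eligible for this pod.
  nodeSelector:
    node-type: high-memory          # assumed node label, not a built-in one
  containers:
  - name: app
    image: busybox:1.36             # placeholder image for illustration
    command: ["sleep", "3600"]      # keep the container running
```

If no node carries the label, the pod stays in the Pending state, so a rule like this is only useful once the node labels are in place.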

Learn more in our detailed guide to Kubernetes pods.

Kubernetes Labels vs. Annotations 

While Kubernetes labels and annotations may seem similar, they serve different purposes. 

Labels are used to identify and select objects based on their characteristics. They form the backbone of many core features in Kubernetes, including service discovery and replication control.

Annotations are not used to identify and select objects. Instead, they attach arbitrary non-identifying metadata to objects, which tools and libraries can read to provide richer functionality. For instance, an annotation could record the last person who modified an object or the deployment strategy used by a continuous deployment tool.

In essence, while labels help in organizing your Kubernetes objects, annotations provide a way to add extra information to these objects without interfering with their identity or role within your system.
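
The difference is easiest to see in a single manifest. The sketch below is illustrative only: the pod name, the label values, and the example.com annotation keys are assumptions chosen for the example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: checkout-frontend                  # hypothetical pod name
  labels:
    # Identifying attributes: label selectors can match on these.
    app: checkout
    environment: production
  annotations:
    # Non-identifying metadata: tools can read it, selectors cannot match on it.
    example.com/last-modified-by: "jane.doe"
    example.com/deploy-strategy: "blue-green"
spec:
  containers:
  - name: web
    image: busybox:1.36                    # placeholder image for illustration
    command: ["sleep", "3600"]
```

A query such as kubectl get pods -l 'app=checkout' would find this pod through its labels, while the annotations never take part in selection.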

Using Kubernetes Labels: Equality-Based and Set-Based 

As mentioned earlier, the Kubernetes API supports two types of selectors: equality-based and set-based. Let’s see how to define labels on an object and then select that object with each type of selector.

Defining Kubernetes Labels

The following YAML manifest defines a pod running a busybox container. The metadata.labels field defines the labels for the pod. In this case there are two labels attached: environment: development and tier: frontend.

apiVersion: v1
kind: Pod
metadata:
  name: set-based-selector-demo
  labels:
    # Each label is a key: value pair under metadata.labels
    environment: development
    tier: frontend
spec:
  containers:
  - name: test-container
    image: k8s.gcr.io/busybox
    # Keep the container running so the pod stays up for the queries below
    command: ["sleep", "3600"]

Using an Equality-Based Label Selector

In Kubernetes, an equality-based label selector filters resources by exact matches on label keys and values. It supports three operators: = and ==, which both mean equality, and !=, which means inequality. The following is an example of how to use an equality-based label selector.

Consider the pod whose YAML configuration is shown above, which is labeled with environment: development and tier: frontend. One way to select this pod using an equality-based selector is the following command:

kubectl get pods -l 'environment=development'

This command lists only pods that have the environment label set to development.

Using a Set-Based Label Selector

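Set-based label selectors allow more complex queries, such as selecting objects based on a set of values for a single label. There are three operators: in, notin, and exists. The exists operator is written as the bare label key (for example, -l 'environment'), and prefixing the key with ! selects objects that do not have that label.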

For example, to select pods that have the environment label set to either development or testing, you can use the following command:

kubectl get pods -l 'environment in (development,testing)'

Pods whose environment label has any other value, or that have no environment label at all, are not listed.
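
Selectors also appear inside manifests. As a rough sketch, a Deployment can combine an equality-based requirement (matchLabels) with a set-based one (matchExpressions, where the operators are written In, NotIn, Exists, and DoesNotExist). The Deployment name, replica count, and image below are assumptions made for the example.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend                      # hypothetical name
spec:
  replicas: 2
  selector:
    # All requirements are ANDed together.
    matchLabels:
      tier: frontend                  # equality-based requirement
    matchExpressions:
    - key: environment                # set-based requirement
      operator: In
      values: [development, testing]
  template:
    metadata:
      labels:
        # The pod template's labels must satisfy the selector above.
        tier: frontend
        environment: development
    spec:
      containers:
      - name: test-container
        image: busybox:1.36           # placeholder image for illustration
        command: ["sleep", "3600"]
```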

Best Practices for Using and Managing Kubernetes Labels 

Here are some best practices that can help you get the most out of your Kubernetes labels.

Use Label Naming Conventions

Using label naming conventions means agreeing on a consistent format for your labels across your organization: which keys exist, how they are spelled and prefixed, and which values each key may take.

Having a standardized naming convention makes it easier for team members to understand and use labels effectively. It also reduces the risk of errors that can occur when there are inconsistencies in label naming.
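
One possible starting point is the set of recommended labels that Kubernetes documents under the app.kubernetes.io/ prefix, combined with your own prefixed keys. In the sketch below, every value, the pod name, and the example.com/team key are hypothetical. A label key has an optional DNS-subdomain prefix followed by a slash, and both the name part of the key and the value are limited to 63 characters.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: checkout-frontend                        # hypothetical pod name
  labels:
    # Recommended labels, shared across tools
    app.kubernetes.io/name: checkout
    app.kubernetes.io/instance: checkout-prod
    app.kubernetes.io/version: "1.4.2"           # quoted so YAML keeps it a string
    app.kubernetes.io/component: frontend
    app.kubernetes.io/part-of: web-store
    app.kubernetes.io/managed-by: helm
    # Organization-specific label under your own prefix (assumed key)
    example.com/team: payments
spec:
  containers:
  - name: web
    image: busybox:1.36                          # placeholder image for illustration
    command: ["sleep", "3600"]
```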

Manage Labels via Code

Another best practice for managing Kubernetes Labels is to manage them through code, rather than manually through kubectl commands. 

You can do this by defining and updating labels within your Kubernetes object manifests. This approach aligns with Infrastructure as Code (IaC) practices, allowing you to maintain version control, consistency, and repeatability in your deployments.
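
One way to do this, shown here only as a sketch, is a Kustomize kustomization.yaml kept in version control next to your manifests. The file names and label values are assumptions. The commonLabels field applies the listed labels to every resource and also to their selectors. Recent Kustomize releases mark commonLabels as deprecated in favor of a labels field, so check the version you run.

```yaml
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- deployment.yaml          # hypothetical manifest files in the same directory
- service.yaml
# Added to every resource above, including selectors and pod templates.
commonLabels:
  environment: production
  team: payments
```

Because the labels live in one reviewed file, a change to them goes through the same review and rollback process as any other code change.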

Avoid Unnecessary Changes to Labels

While it can be tempting to update labels as your needs evolve, it’s important to avoid unnecessary changes to your labels. This is because changing a label can have wide-reaching impacts on your Kubernetes objects and could potentially disrupt your services.

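For instance, changing a pod label that a ReplicaSet or Deployment selects on removes the pod from that controller’s management, and the controller starts a replacement pod. Changing a node label can alter where new pods are scheduled, and changing a label that a service selects on causes the service to lose its association with those pods. Therefore, it’s important to thoroughly consider the potential impacts of changing a label before doing so.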
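
To make the service case concrete, here is a minimal sketch of a Service that finds its pods purely through labels. The service name and port numbers are assumptions.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend                # hypothetical name
spec:
  # Traffic is routed only to pods whose labels match every pair below.
  selector:
    tier: frontend
    environment: development
  ports:
  - port: 80
    targetPort: 8080            # assumed container port
```

If a pod’s tier or environment label is changed or removed, the pod no longer matches the selector and stops receiving traffic from this service, even though the pod itself keeps running.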

Don’t Store Sensitive Information in Labels

Labels may seem like a convenient place to store sensitive information such as passwords or access keys, but they are not designed for this purpose. Labels are not encrypted, and anyone who can read an object in your Kubernetes cluster can read its labels. Keep sensitive values out of labels and store them in a mechanism built for that purpose, such as Kubernetes Secrets.

Use Labels for Cost Monitoring

Finally, an often-overlooked use of Kubernetes labels is cost monitoring. By assigning labels to your Kubernetes objects that represent different cost centers (e.g., different departments or projects), you can track the resource consumption of these cost centers and allocate costs accordingly.

This can provide valuable insights into the cost efficiency of different parts of your organization and can help inform decisions on resource allocation and budgeting.

Automating Kubernetes Infrastructure with Spot by NetApp

Spot Ocean from Spot by NetApp frees DevOps teams from the tedious management of their cluster’s worker nodes while helping reduce cost by up to 90%. Spot Ocean’s automated optimization delivers the following benefits:

  • Container-driven autoscaling for the fastest matching of pods with appropriate nodes
  • Easy management of workloads with different resource requirements in a single cluster
  • Intelligent bin-packing for highly utilized nodes and greater cost-efficiency
  • Cost allocation by namespaces, resources, annotations, and labels
  • Reliable usage of the optimal blend of spot, reserved and on-demand compute pricing models
  • Automated infrastructure headroom ensuring high availability
  • Right-sizing based on actual pod resource consumption  

Learn more about Spot Ocean today!