
Over the coming weeks, I will be writing a series of blog posts covering stateful applications running on Kubernetes:
- #1 – This post: Introduction to Stateful Applications on Kubernetes
- #2 – Storage Classes & Dynamic Provisioning
- #3 – StatefulSets & PodDisruptionBudgets
Before we begin, it is important to understand the terms Pod, Volume, Persistent Volume and Persistent Volume Claim.
Pod
A pod is a group of one or more containers (such as Docker containers), with shared storage/network, and a specification for how to run the containers.
Volume
When running multiple containers together in a Pod, it is often necessary to share files between those containers; volumes are used for this purpose.
On-disk files in a container are ephemeral. When a container crashes, it is restarted, but its files are lost: the container starts with a clean state.
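As a rough illustration of file sharing inside a Pod, here is a minimal sketch in which two containers mount the same emptyDir volume. The Pod name, container names, images and paths are all made up for this example; they are not part of the setup used later in this post.

```yaml
# Hypothetical Pod: two containers sharing one emptyDir volume.
apiVersion: v1
kind: Pod
metadata:
  name: shared-files-demo          # illustrative name
spec:
  containers:
    - name: writer
      image: busybox
      # Appends a timestamp to a file on the shared volume every 5 seconds.
      command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
    - name: reader
      image: busybox
      # Follows the file that the writer container keeps appending to.
      command: ["sh", "-c", "touch /data/out.log; tail -f /data/out.log"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
  volumes:
    - name: shared-data
      emptyDir: {}                 # survives container restarts, removed with the Pod
```

Files written under /data survive a restart of either container, because the volume belongs to the Pod rather than to a single container, but they disappear once the Pod itself is removed.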
Persistent Volume
A PersistentVolume (PV) is a piece of storage in the cluster. It is a resource in the cluster, just like a node is a cluster resource.
PVs are volume plugins like volumes, but have a lifecycle independent of any individual pod that uses the PV.
Volume vs. Persistent Volume
Volumes and Persistent Volumes are similar in nature to ephemeral disks and EBS volumes, respectively.
Volumes are deleted when their Pods are deleted, while Persistent Volumes are separate entities, completely decoupled from the Pod. PVs are managed through their own API objects and kubectl commands, separate from those for Pods, and have their own lifecycle.
Volume Internals
- To use a volume, a pod specifies what volumes to provide for the pod (the spec.volumes field) and where to mount those into containers (the spec.containers.volumeMounts field).
- A process in a container sees a filesystem view composed of its Docker image and volumes. The Docker image is at the root of the filesystem hierarchy, and any volumes are mounted at the specified paths within the image.
- Volumes can not mount onto other volumes or have hard links to other volumes. Each container in the Pod must independently specify where to mount each volume.
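The relationship between spec.volumes and volumeMounts can be sketched as follows. This is only an illustration: the Pod name, volume names, images and mount paths are assumptions chosen for the example.

```yaml
# Hypothetical Pod showing how spec.volumes and volumeMounts fit together.
apiVersion: v1
kind: Pod
metadata:
  name: volume-mounts-demo         # illustrative name
spec:
  volumes:                         # spec.volumes: what the Pod provides
    - name: content
      emptyDir: {}
    - name: scratch
      emptyDir: {}
  containers:
    - name: generator
      image: busybox
      command: ["sh", "-c", "echo hello > /work/index.html; sleep 3600"]
      volumeMounts:                # where this container sees each volume
        - name: content
          mountPath: /work
        - name: scratch
          mountPath: /tmp/scratch
    - name: web
      image: nginx
      volumeMounts:
        - name: content            # same volume, different path, read-only
          mountPath: /usr/share/nginx/html
          readOnly: true
      # "scratch" is not listed here, so this container does not see it
```

Each container lists its own mounts: the same volume can appear at a different path in each container, and a volume that a container does not mention is simply not visible to it.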
Persistent Volume Claims
A PersistentVolumeClaim (PVC) is a request for storage. It is similar to a pod: while Pods consume node resources, PVCs consume Persistent Volume resources. Pods can request specific levels of resources (CPU and memory). Claims can request specific sizes and access modes (e.g., mounted once as read/write or many times as read-only).
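To make the parallel concrete, the sketch below puts a Pod's resource requests next to a claim's storage request. Both objects, their names and the requested amounts are hypothetical and serve only to show where each kind of request lives in the manifest.

```yaml
# Pod side: requests node resources (CPU and memory).
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo              # illustrative name
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: 250m
          memory: 128Mi
---
# Claim side: requests Persistent Volume resources (size and access mode).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: storage-demo               # illustrative name
spec:
  accessModes:
    - ReadWriteOnce                # read/write, mountable by a single node
  resources:
    requests:
      storage: 1Gi
```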
Tying it all together
To demonstrate how PersistentVolumes work, let's use a real-life example: setting up a database container in a Pod and mounting its data volumes on a Persistent Volume.
Persistent Volume definition: database-pv.yml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: database-pv
  labels:
    type: amazonEBS
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  awsElasticBlockStore:
    volumeID: vol-123abcd
    fsType: ext4

$ kubectl create -f database-pv.yml
persistentvolume "database-pv" created
Note: with the ReadWriteOnce access mode, the storage can be mounted for reading/writing by only one node.
PVC definition: requesting 5Gi in ReadWriteOnce mode – database-pvc.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: database-pvc
  labels:
    type: amazonEBS
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

$ kubectl create -f database-pvc.yml
persistentvolumeclaim "database-pvc" created
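One caveat: on clusters that have a default StorageClass, a claim that does not set storageClassName may be satisfied by a freshly provisioned volume rather than by database-pv. The variant below is a sketch of one way to steer the claim toward the pre-created volume; the storageClassName and selector fields are additions for illustration and are not part of the original database-pvc.yml.

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: database-pvc
spec:
  storageClassName: ""             # empty string: bind only to PVs without a class
  selector:
    matchLabels:
      type: amazonEBS              # only consider PVs carrying this label
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

With these two fields set, the claim should be matched only against existing PVs that have no storage class and carry the type: amazonEBS label, which is the case for database-pv as defined above.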
Now let's create a Deployment resource, which in turn creates a Pod that uses the previously created PVC, which claims the PersistentVolume.
Key parts here are:
- volumeMounts defines which volumes are mounted into the container. /app/database is the directory where the database server stores all its data.
- volumes defines the volumes that can be used in this Deployment definition.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: database
spec:
  selector:
    matchLabels:
      app: database
  replicas: 1
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
        - name: database-pod
          image: repo/database
          volumeMounts:
            - mountPath: "/app/database"
              name: database-pvc
          ports:
            - containerPort: 3306
      volumes:
        - name: database-pvc
          persistentVolumeClaim:
            claimName: database-pvc
$ kubectl create -f database-rc.yml
deployment.apps "database" created
All Set.
Now, if we delete the database pod, or the pod is destroyed for any reason, Kubernetes will automatically re-create it, and the same storage will be attached to the new container, with its data intact up to the last write to the filesystem.