Kubernetes Series Learning Note 6: Storage

May 11, 2020 3-minute read

After covering Kubernetes security in the previous post, this post covers storage in Kubernetes.

Volume

Before we jump into Kubernetes, I would like to refresh the Docker Volume concept. Docker containers are usually short-lived. When a container is removed, the data generated while it was running is lost. The Volume concept exists to keep the data or files that containers generate. A Volume can be treated as a location on the host that is mounted into a working directory inside the container, so every file the container writes to that directory is also written to the host location.

In Kubernetes, we can also attach a volume to a pod. For example, in a pod definition file, we can specify the volumes section under the spec as follows.

spec:
    containers:
    - image:
      name:
      command:
    volumes:
    - name: 
      hostPath:
        path: /data
        type: Directory
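
The fragment above only declares the volume. A container sees it once the volume is also listed under that container's volumeMounts. A minimal complete sketch follows, where the pod name, container name, image, command, volume name, and mount path are all placeholder assumptions rather than values from the original:

```yaml
apiVersion: v1
kind: Pod
metadata:
    name: hostpath-demo            # hypothetical pod name
spec:
    containers:
    - name: writer                 # hypothetical container name
      image: busybox               # assumed image; any image with a shell would do
      # append a timestamp to a file on the mounted volume, then stay alive
      command: ["sh", "-c", "date >> /opt/out.txt && sleep 3600"]
      volumeMounts:
      - name: data-volume          # must match the volume name declared below
        mountPath: /opt            # assumed path inside the container
    volumes:
    - name: data-volume            # hypothetical volume name
      hostPath:
        path: /data                # directory on the node, as in the fragment above
        type: Directory            # the directory must already exist on the node
```

With this layout, anything the container writes under /opt ends up in /data on the node where the pod is scheduled.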

Normally, however, a cluster has multiple nodes with pods spread across them. With hostPath, each node keeps only the data generated by the pods running on it, which means we can’t hold the data or files in one place. External storage solutions help with this issue and make sure all files or data generated by the cluster are gathered in one place. For example, we can use AWS Elastic Block Store.

volumes:
    - name:
      awsElasticBlockStore:
        volumeID: <volume-id>
        fsType: ext4

Persistent Volume

In Kubernetes, there are many pods and applications across nodes, and each of them needs storage in some way. To manage storage centrally, we bring in the Persistent Volume. We can think of it as an area of land reserved for planting a specific type of crop (pods). A sample Persistent Volume definition is as follows.

apiVersion: v1
kind: PersistentVolume
metadata: 
    name: pv-volume
spec:
    accessModes:
      - ReadWriteOnce
    capacity:
      storage: 1Gi
    hostPath:
      path: /tmp/data
    # hostPath can be replaced with an external storage solution as well,
    # similar to the AWS EBS definition in the previous section
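
As the comments in the definition above suggest, the hostPath block can be swapped for an external backend. A sketch of the same Persistent Volume backed by AWS EBS follows. The name pv-ebs-volume is a hypothetical choice, and <volume-id> stays a placeholder for a real EBS volume ID. On recent Kubernetes versions this in-tree volume type is deprecated in favor of the EBS CSI driver, so treat this as an illustration of the shape rather than a recommended setup:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
    name: pv-ebs-volume            # hypothetical name
spec:
    accessModes:
      - ReadWriteOnce
    capacity:
      storage: 1Gi
    awsElasticBlockStore:          # replaces the hostPath block
        volumeID: <volume-id>      # placeholder: ID of an existing EBS volume
        fsType: ext4
```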

To declare which crop will be planted on a reserved area, in other words which persistent volume the pods will use, we bring in the Persistent Volume Claim. Each Persistent Volume Claim is bound to exactly one Persistent Volume. Kubernetes binds them according to criteria such as sufficient capacity, access modes, and storage class. A label selector can also be used to bind a PVC to a specific PV. If no matching PV is available, the new PVC remains in Pending status until a PV becomes available. A sample Persistent Volume Claim is as follows.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
    name: myclaim
spec:
    accessModes:
      - ReadWriteOnce
    resources:
        requests:
          storage: 500Mi
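
A claim is only useful once a pod refers to it. A minimal sketch of a pod that mounts the claim myclaim follows. The pod name, container name, image, volume name, and mount path are placeholder assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
    name: claim-demo               # hypothetical pod name
spec:
    containers:
    - name: app                    # hypothetical container name
      image: nginx                 # assumed image
      volumeMounts:
      - name: storage              # must match the volume name declared below
        mountPath: /usr/share/nginx/html   # assumed path inside the container
    volumes:
    - name: storage                # hypothetical volume name
      persistentVolumeClaim:
        claimName: myclaim         # name of the PVC this pod uses
```

The selector-based binding mentioned earlier can be sketched in the same way. Here the label tier: demo is a hypothetical choice, and the empty storageClassName is an assumption added so that a cluster with a default storage class does not try to provision a new volume instead of using the pre-created one:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
    name: pv-volume
    labels:
        tier: demo                 # hypothetical label the claim selects on
spec:
    accessModes:
      - ReadWriteOnce
    capacity:
      storage: 1Gi
    hostPath:
      path: /tmp/data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
    name: myclaim
spec:
    storageClassName: ""           # assumption: only consider PVs with no storage class
    accessModes:
      - ReadWriteOnce
    resources:
        requests:
          storage: 500Mi
    selector:
        matchLabels:
            tier: demo             # only PVs carrying this label can be bound
```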

A Persistent Volume also has a reclaim policy, which decides what happens to the PV after its PVC is deleted. There are three reclaim policies.

persistentVolumeReclaimPolicy: Retain: When a PVC is deleted, the bound PV is kept along with its data and has to be reclaimed or deleted manually. In the meantime, the PV will not be matched to a new PVC.
persistentVolumeReclaimPolicy: Delete: When a PVC is deleted, the bound PV is deleted automatically.
persistentVolumeReclaimPolicy: Recycle: When a PVC is deleted, the data on the bound PV is scrubbed and the PV can be reused by a new PVC. This policy is deprecated.
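
The policy is set in the spec of the Persistent Volume itself. A sketch that reuses the pv-volume definition from earlier, with Retain chosen only as an example value:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
    name: pv-volume
spec:
    persistentVolumeReclaimPolicy: Retain   # example value; Delete and Recycle go in the same field
    accessModes:
      - ReadWriteOnce
    capacity:
      storage: 1Gi
    hostPath:
      path: /tmp/data
```

For a manually created PV like this one, leaving the field out should give the same behavior, since Retain is the default in that case.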