Celebrate Pi Day by Taking a Bite out of Volumes

By Mike Metral
Mon, Mar 14, 2016 using tags KubeCon , Volumes , Flocker

   Vote on Hacker News  

Happy Pi Day Kubernauts!

In this segment, we will be celebrating & honoring Pi Day by enriching your understanding of the various Volumes types that Kubernetes supports, and how they compare amongst one another.

Coming fresh out of KubeCon in London last week, we also want to give you a quick update as to how the conference unfolded and how this pertains to our deep-dive into Volumes.

Without further ado, lets take a bite out of Kubernetes Volumes (and pie) to celebrate not just mathematics, but technology too!



Outline

KubeCon Wrap-Up

KubeCon in London was a great success and there is no doubt that all attendees walked away not only impressed with the talks presented, but also with a valuable insight into what community really means.

But don’t just take our word for it, here’s actual feedback from your fellow community members, and more can be found here and here!



In case you weren’t able to make it to KubeCon, check out the Day #1 and Day #2 summaries we put together. These two posts should hold you over until the official videos get edited & published in the upcoming weeks.

In one of the KubeCon talks, My Microservices are not all stateless! by Kai Davenport of ClusterHQ, there was a general overview of the various volumes supported by Kubernetes and how Flocker aimed to solve a lot of the data-oriented usecases in Kubernetes. This talk sets the stage for a deep-dive on volumes that we’ve been brewing up for a while now!

Volumes

Overview

Kubernetes supports several Volume types, some of which are representations for specific Volume backends, i.e. NFS, and can even span to volume management systems such as Flocker.

Each Volume type not only differs from one another, but brings forth a utilization and interaction with Kubernetes that must be examined & handled on a per-case basis.

It is recommended to use a Kubernetes object abstraction as seen with PersistentVolumes and PersistentVolumeClaims, as they not only absolve the Pod of the backend volume’s configuration details, but facilitate its consumption from within the Kubernetes cluster. Keeping these points in mind, lets examine the Volumes as a whole.

To simplify, and visualize, the various Volume types supported, here’s a chart defining, describing and showcasing the features each type of volume backend offers:



(click to view Google Sheet)

Volume Categories

In order to truly gain an understanding of the numerous Volume types, we’re going to compartmentalize the types using some unofficial, but easy-to-understand, categories.

These categories are reflected in the chart above, so refer to the respective grouping when examining each of their details below.

Local Node Storage

The Local Node Storage category represents a volume that Kubernetes directly implements on the Node hosting the Pod(s). This volume category is defined as such:

A Kubernetes volume…has an explicit lifetime - the same as the pod that encloses it. Consequently, a volume outlives any containers that run within the Pod, and data is preserved across Container restarts.

Of course, when a Pod ceases to exist, the volume will cease to exist, too. Perhaps more importantly than this, Kubernetes supports many type of volumes, and a Pod can use any number of them simultaneously.

The volume can either be non-persistent or persistent, depending on the type used, and is shared across ReplicaSets.

Examples include: Empty Directory, Host Path, git Repo, Secret

Networked Storage

The Networked Storage category represents a volume that Kubernetes does not directly implement on the Node hosting the Pod(s). This storage backend is independently implemented by your cluster administrator, and accessed through a network connection.

The volume is persistent, and depending on the actual type, it can be shared across ReplicaSets.

Examples include: NFS, iSCSI, GlusterFS, Ceph, Fibre Channel

Cloud Provider Storage

The Cloud Provider Storage category represents a volume that Kubernetes does not directly implement on the Node hosting the Pod(s). This storage backend is implemented by a cloud provider for its user, and attached to the VM where the Kubernetes Node is running.

The volume is persistent, and depending on the actual type, it can be shared across ReplicaSets.

Examples include: AWS & GCE block devices.

Storage Utilities

The Storage Utilities category does not represent an actual volume, rather, a volume management utility that Kubernetes depends on. This management utility is independently implemented & managed by your cluster administrator.

The volume’s persistence and sharing capabilities depend on the actual volume backing the configuration that the utility is managing.

Examples include: Flocker.

Special Cases

Utilities themselves have their own set of rules that you must adhere to, and could present distinct results for Pods depending on the actual volume backend configured.

For example, in the case of a utility such as Flocker, the volume backend it supports not only dictates its implementation, but how Kubernetes interacts with it.

Specifically, if Flocker is configured with Ceph, then Pods across a ReplicaSet, where the Pods span multiple Nodes, could share the volume backend if it is setup to allow for it. However, if Flocker is configured with an AWS EBS block device, AWS only allows an EBS block device to be attached to a single Node/VM. This in turn means that Pods across a ReplicaSet, where the Pods span multiple Nodes, would not work, because the Pods cannot share the volume backend due to the single-machine attachment restriction that AWS is imposing – that is, unless all the Pods are on the same exact Node.

Nuances such as the one previously mentioned, need to be analyzed and evaluated on a case-by-case basis when making use of a storage utility to manage your volume backends.

Misc.

The Misc. category represents a volume abstraction in Kubernetes for the Pod to consume. The abstraction is created before the Pod is instantiated.

The persistence and sharing capabilities of this abstraction depends on the configuration of the actual volume that the abstraction represents.

Example: PersistentVolumeClaim

Conclusion

The work around Volumes in Kubernetes is actively being improved upon, and is also modular enough that you can create your own type as a plugin, if it currently does not exist.

Working with data in containers, such as with message queues & databases, which happen to be stateful, is still quite an anomaly as the confidence in their persistence and resiliency is still immature. This perception is sure to evolve as storage utilities such as Flocker grow and are given management over your Volumes.

However, it must stated that the best of mode of operation when it comes to data that Kubernetes Pods depend on, is the data that the Pod brings along with it. What this means, is that if you construct your Pod to be stateless, i.e. it either relies on a sidecar container to initialize its state from an outside source, or the Pod itself pulls the data it needs & pushes it out once its done, is proven to be the simplest model to interact with Kubernetes today.

We hope that you’ve gained a deeper insight into the various Volumes you can utilize, and that you have a great Pi Day!


   Vote on Hacker News