Many applications you might wish to deploy on a Kubernetes cluster require persistent storage in the form of PVCs (persistent volume claims) and PVs (persistent volumes). As pods are ephemeral, and app data is not normally built into a container image (keeping app and app data separate), the data that users generate while a pod is running must be stored elsewhere. Many cloud-native applications, such as those developed as Twelve-Factor Applications and built as microservices, may use modern methods such as an external key-value store (e.g. Redis), database (e.g. MongoDB or PostgreSQL), or object storage (e.g. S3) to store such data so that the code in the container can be immutable. On the other hand, many applications that pre-dated containers and Kubernetes rely on an application directory which might separate app code and app data less cleanly. For proper operation, those applications assume that each replica of the application has the same view of the app directory at any given time, even if each replica (pod) is running on a different Kubernetes node.
This is where a ReadWriteMany (RWX) persistent volume comes in handy. By default, most storage classes in Kubernetes support only the ReadWriteOnce (RWO) mode, meaning that a volume can be mounted read-write only by pods residing on a single Kubernetes node. This is clearly insufficient for deployments where replicas spread across multiple Kubernetes nodes need to write to the same persistent volume. RWX-capable storage classes make use of protocols such as NFS (often backed by OpenZFS) and clustered filesystems such as GlusterFS (deprecated) and CephFS to allow multiple pods on multiple nodes to write to a PV without data corruption.
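From the application's point of view, the only difference is the access mode requested in the claim. Here is a minimal sketch using the official kubernetes Python client; the StorageClass name "rwx-nfs" is a placeholder for whatever RWX-capable class your cluster actually provides.

```python
# Minimal sketch: requesting an RWX persistent volume claim with the
# official kubernetes Python client. "rwx-nfs" is a placeholder for an
# RWX-capable StorageClass in your cluster.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running in a pod

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="shared-app-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteMany"],   # RWX: writable from pods on many nodes
        storage_class_name="rwx-nfs",     # placeholder RWX-capable StorageClass
        resources=client.V1ResourceRequirements(requests={"storage": "50Gi"}),
    ),
)
client.CoreV1Api().create_namespaced_persistent_volume_claim("default", pvc)
```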
Even if all the applications you wish to run on Kubernetes are cloud-native and use modern ways of persisting data such as a key-value, document, or relational database, or object storage, you might still require a ReadWriteMany (RWX) storage option if you choose to run your database in Kubernetes. For a database use case, it is worthwhile to choose the highest performance option, such as NetApp ONTAP, which is optimized for random I/O. Some distributed databases, like Redis, don't require RWX and can simply use a separate RWO PV for each replica, which may be better suited to your requirements. As a general rule, though, it is usually wiser to run the database entirely outside of Kubernetes. Indeed, using a managed DB that automatically replicates your data between the primary and replica nodes (in a single- or multi-AZ setup) with failover takes many of the headaches out of operating databases such as MySQL or PostgreSQL.
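The per-replica RWO pattern mentioned above is usually expressed with a StatefulSet whose volumeClaimTemplates create one claim per replica. The sketch below assumes a Redis image and a "gp3" StorageClass purely for illustration.

```python
# Sketch of the per-replica RWO pattern: a StatefulSet whose
# volumeClaimTemplates give each Redis replica its own ReadWriteOnce PVC.
# Names, the image tag, and the "gp3" StorageClass are illustrative placeholders.
from kubernetes import client, config

config.load_kube_config()

redis = client.V1Container(
    name="redis",
    image="redis:7",
    volume_mounts=[client.V1VolumeMount(name="data", mount_path="/data")],
)

sts = client.V1StatefulSet(
    metadata=client.V1ObjectMeta(name="redis"),
    spec=client.V1StatefulSetSpec(
        service_name="redis",
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"app": "redis"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "redis"}),
            spec=client.V1PodSpec(containers=[redis]),
        ),
        volume_claim_templates=[
            client.V1PersistentVolumeClaim(
                metadata=client.V1ObjectMeta(name="data"),
                spec=client.V1PersistentVolumeClaimSpec(
                    access_modes=["ReadWriteOnce"],  # one dedicated PV per replica
                    storage_class_name="gp3",
                    resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
                ),
            )
        ],
    ),
)
client.AppsV1Api().create_namespaced_stateful_set("default", sts)
```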
This article provides an overview of the ReadWriteMany (RWX) persistent storage options available for Kubernetes in the cloud landscape today. These range from fully managed solutions (e.g. Amazon EFS and FSx) to solutions that you operate yourself (e.g. TrueNAS and Rook CephFS). Particularly for web applications (with a webroot and data directory) that cannot easily be re-factored to break user data out into a database or object storage, using RWX-capable PVs can greatly ease your journey to Kubernetes.
Example Use Case – Nextcloud on Kubernetes with RWX PV
An example of an application for which we recommend using an RWX persistent volume is Nextcloud on Kubernetes. For Nextcloud, even though the app and data directories can be assigned to separate PVs, all replicas of the Nextcloud pod must be able to update the config and app directories during operation, necessitating the use of RWX PVs for both app and data. Without this, administration settings cannot be saved, and apps cannot be added or removed through the app store. Using 100% object storage (both for stored user data and the core's static assets) is often touted, including by the Nextcloud maintainers themselves, as the solution for deploying Nextcloud on Kubernetes. In our experience working with numerous clients, however, Nextcloud's performance slows to a crawl if the static assets are stored on object storage and not on an actual volume.
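To make the requirement concrete, the sketch below (not an official Nextcloud manifest) creates two RWX claims, one for the app/config directory and one for user data, and mounts both in every replica of a Deployment. The image tag, mount paths, and the "rwx-nfs" StorageClass are assumptions.

```python
# Illustrative sketch (not an official Nextcloud manifest): two RWX PVCs,
# one for the Nextcloud app/config directory and one for user data, mounted
# by every replica of a Deployment. Names, mount paths, the image tag, and
# the "rwx-nfs" StorageClass are assumptions.
from kubernetes import client, config

config.load_kube_config()
core, apps = client.CoreV1Api(), client.AppsV1Api()

def rwx_pvc(name, size):
    # Helper: build an RWX claim of the requested size
    return client.V1PersistentVolumeClaim(
        metadata=client.V1ObjectMeta(name=name),
        spec=client.V1PersistentVolumeClaimSpec(
            access_modes=["ReadWriteMany"],
            storage_class_name="rwx-nfs",
            resources=client.V1ResourceRequirements(requests={"storage": size}),
        ),
    )

for name, size in [("nextcloud-app", "10Gi"), ("nextcloud-data", "200Gi")]:
    core.create_namespaced_persistent_volume_claim("default", rwx_pvc(name, size))

pod_spec = client.V1PodSpec(
    containers=[client.V1Container(
        name="nextcloud",
        image="nextcloud:apache",  # illustrative image tag
        volume_mounts=[
            client.V1VolumeMount(name="app", mount_path="/var/www/html"),
            client.V1VolumeMount(name="data", mount_path="/var/www/html/data"),
        ],
    )],
    volumes=[
        client.V1Volume(name="app", persistent_volume_claim=client.V1PersistentVolumeClaimVolumeSource(claim_name="nextcloud-app")),
        client.V1Volume(name="data", persistent_volume_claim=client.V1PersistentVolumeClaimVolumeSource(claim_name="nextcloud-data")),
    ],
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="nextcloud"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # replicas may land on different nodes; RWX keeps their view consistent
        selector=client.V1LabelSelector(match_labels={"app": "nextcloud"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "nextcloud"}),
            spec=pod_spec,
        ),
    ),
)
apps.create_namespaced_deployment("default", deployment)
```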
Many storage orchestrator solutions have come and gone since the rise of Kubernetes, including OpenEBS, Longhorn, and StorageOS. These solutions have largely given way to Rook as the standard recognized by the Linux Foundation and the broader Kubernetes community. Rook is built on Ceph, the leading clustered storage system, used by major cloud providers themselves to provide replicated block storage. The Rook project includes three separate Kubernetes CSI drivers for persistent volumes (a short sketch of the resulting claims follows the list below):
- RBD CSI driver – This uses the RADOS Block Devices (RBDs) of the Ceph cluster directly and supports only the ReadWriteOnce (RWO) mode.
- CephFS CSI driver – CephFS is a relatively newer feature that has matured over recent Ceph releases, providing a shared POSIX filesystem and supporting the ReadWriteMany (RWX) mode.
- NFS CSI driver – Exports Ceph-backed volumes as NFS shares for use by any application that supports NFS mounts.
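The practical difference between the first two drivers shows up in the access mode of the claims they serve. A sketch, assuming the StorageClass names "rook-ceph-block" and "rook-cephfs" from the Rook example manifests (yours may differ):

```python
# Sketch contrasting the two Ceph CSI drivers: an RWO claim on an RBD-backed
# class and an RWX claim on a CephFS-backed class. The class names
# "rook-ceph-block" and "rook-cephfs" follow the Rook example manifests and
# may differ in your cluster.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

def pvc(name, mode, sc, size):
    return client.V1PersistentVolumeClaim(
        metadata=client.V1ObjectMeta(name=name),
        spec=client.V1PersistentVolumeClaimSpec(
            access_modes=[mode],
            storage_class_name=sc,
            resources=client.V1ResourceRequirements(requests={"storage": size}),
        ),
    )

# RBD: block device, single-node writer only (RWO)
core.create_namespaced_persistent_volume_claim("default", pvc("db-block", "ReadWriteOnce", "rook-ceph-block", "20Gi"))
# CephFS: shared filesystem, writers on many nodes (RWX)
core.create_namespaced_persistent_volume_claim("default", pvc("shared-fs", "ReadWriteMany", "rook-cephfs", "100Gi"))
```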
OpenEBS and Longhorn are open source solutions that have somewhat faded into obscurity as other Kubernetes distributions became more popular than Rancher, while StorageOS was a software-defined storage platform for Kubernetes licensed by storage capacity ($/TB/month). We would not typically recommend adopting OpenEBS or Longhorn at this time, as both projects are in maintenance mode, receiving mostly patches and bug fixes rather than features for new releases of Kubernetes.
- Deprecation announcement for legacy OpenEBS projects https://github.com/openebs/openebs/issues/3709
- The docs for Ondat (the parent company of StorageOS) are still accessible at https://docs.ondat.io/ but the company website https://ondat.io/ is no longer available.
Choosing the storage option that you will use with your Kubernetes applications is essential to a successful deployment. If you are looking at moving an application to Kubernetes and are not sure which type of managed storage will provide the robustness, security, and performance you need, contact our cloud architects for advice and implementation support on any cloud provider.
RWX Storage Options for Kubernetes Persistent Volumes
NFS
Pros: Very stable and mature NFS CSI driver for RWX persistent volumes
Cons: The NFS daemon is *not* highly available unless you configure failover with Corosync/Pacemaker. You also need to handle data replication between NFS nodes yourself with a solution like DRBD.
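If you go the plain NFS route, dynamic provisioning is typically handled by the upstream NFS CSI driver (nfs.csi.k8s.io). A sketch of the corresponding StorageClass, with placeholder server and export values:

```python
# Sketch of a StorageClass for the upstream NFS CSI driver
# (https://github.com/kubernetes-csi/csi-driver-nfs). The server address
# and export path are placeholders for your NFS appliance.
from kubernetes import client, config

config.load_kube_config()

nfs_sc = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="rwx-nfs"),
    provisioner="nfs.csi.k8s.io",
    parameters={
        "server": "nfs.example.internal",  # placeholder NFS server
        "share": "/export/k8s",            # placeholder export path
    },
    reclaim_policy="Retain",
    mount_options=["nfsvers=4.1"],
)
client.StorageV1Api().create_storage_class(nfs_sc)
```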
Object Storage
Pros: Relatively inexpensive, a simple API, and durable data storage. No need to pre-provision storage capacity; the bucket grows as you add objects to it.
Cons: Does not offer a POSIX-compliant filesystem. Connection with a FUSE CSI driver can be unstable (“transport endpoint is not connected” and similar errors). Relatively slow for reading small files. Difficult to back up/restore data if the application obfuscates data behind object IDs (e.g. Nextcloud does this when using objects as primary storage).
Amazon EFS, Azure Files, Google Filestore, OCI File Storage
Pros: Fully managed. Easy to create and attach as K8S RWX persistent volumes using NFS CSI driver.
Cons: EFS performance is inferior to OpenZFS. Lock-in to a provider-specific service. Moderately expensive per GB, yet less of an “enterprise” solution than NetApp ONTAP.
- Amazon EFS – https://aws.amazon.com/efs/
- Azure Files – https://azure.microsoft.com/en-us/products/storage/files
- Google Filestore – https://cloud.google.com/filestore?hl=en
- OCI File Storage – https://www.oracle.com/cloud/storage/file-storage/
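Taking Amazon EFS as an example, the sketch below defines a StorageClass for the AWS EFS CSI driver using access-point based dynamic provisioning; the file system ID is a placeholder, and the other providers listed above have analogous CSI drivers with their own parameters.

```python
# Sketch of a StorageClass for the AWS EFS CSI driver with access-point based
# dynamic provisioning. The fileSystemId is a placeholder value.
from kubernetes import client, config

config.load_kube_config()

efs_sc = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="efs-rwx"),
    provisioner="efs.csi.aws.com",
    parameters={
        "provisioningMode": "efs-ap",            # one EFS access point per PV
        "fileSystemId": "fs-0123456789abcdef0",  # placeholder EFS file system ID
        "directoryPerms": "700",
    },
)
client.StorageV1Api().create_storage_class(efs_sc)
```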
OpenZFS (TrueNAS, Amazon FSx for OpenZFS, OVH HA-NAS)
Pros: Uses the tried-and-tested ZFS storage backing (very stable). A self-hosted implementation (TrueNAS) can be deployed on dedicated and on-prem servers. Easy to create and attach as K8S RWX persistent volumes using NFS CSI driver.
Cons: Need to manage HA yourself if using TrueNAS. Most implementations have slower I/O than NetApp ONTAP.
- TrueNAS – https://www.truenas.com/truenas-community-editions/
- Amazon FSx for OpenZFS – https://aws.amazon.com/fsx/openzfs/
- OVH HA-NAS – https://www.ovhcloud.com/en/storage-solutions/nas-ha/
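Since these appliances expose ordinary NFS exports, an existing dataset can also be attached statically rather than through a dynamic StorageClass. A sketch, with a placeholder FSx for OpenZFS DNS name and export path:

```python
# Sketch: statically attaching an existing OpenZFS NFS export (e.g. a TrueNAS
# dataset or an FSx for OpenZFS volume) as an RWX PV, then binding a PVC to it.
# The server DNS name and export path are placeholders.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

pv = client.V1PersistentVolume(
    metadata=client.V1ObjectMeta(name="zfs-share"),
    spec=client.V1PersistentVolumeSpec(
        capacity={"storage": "500Gi"},
        access_modes=["ReadWriteMany"],
        persistent_volume_reclaim_policy="Retain",
        mount_options=["nfsvers=4.1"],
        nfs=client.V1NFSVolumeSource(
            server="fs-0123456789abcdef0.fsx.eu-west-1.amazonaws.com",  # placeholder
            path="/fsx/k8s-share",                                      # placeholder
        ),
    ),
)
core.create_persistent_volume(pv)

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="zfs-share"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteMany"],
        storage_class_name="",     # empty string disables dynamic provisioning
        volume_name="zfs-share",   # bind directly to the PV created above
        resources=client.V1ResourceRequirements(requests={"storage": "500Gi"}),
    ),
)
core.create_namespaced_persistent_volume_claim("default", pvc)
```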
NetApp Cloud Volumes ONTAP (Amazon FSx for NetApp ONTAP, Azure NetApp Files, Google Cloud NetApp Volumes, OVH Enterprise File Storage)
Pros: High IOPS for I/O bound operations. Easy to create and attach as K8S RWX persistent volumes using NFS CSI driver. ONTAP supports iSCSI as well but Kubernetes CSI support is limited.
Cons: Relatively expensive per GB. Lock-in to NetApp technology. No at-rest encryption with a customer-managed key at some providers (an OVHcloud pre-sales engineer confirmed this for OVH Enterprise File Storage).
- Amazon FSx for NetApp ONTAP – https://aws.amazon.com/fsx/netapp-ontap/
- Azure NetApp Files – https://azure.microsoft.com/en-us/products/netapp
- Google Cloud NetApp Volumes – https://cloud.google.com/netapp-volumes?hl=en
- OVH Enterprise File Storage – https://www.ovhcloud.com/en/storage-solutions/enterprise-file-storage/
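Where the provider exposes ONTAP through NetApp's Trident CSI driver (e.g. Amazon FSx for NetApp ONTAP), dynamic RWX provisioning looks roughly like the sketch below; it assumes Trident is installed and an NFS-based "ontap-nas" backend has already been configured.

```python
# Sketch of a StorageClass for NetApp's Trident CSI driver, assuming Trident
# is installed and an NFS "ontap-nas" backend has been configured against the
# ONTAP service (e.g. FSx for NetApp ONTAP).
from kubernetes import client, config

config.load_kube_config()

ontap_sc = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="ontap-rwx"),
    provisioner="csi.trident.netapp.io",
    parameters={"backendType": "ontap-nas"},  # NFS personality of the ONTAP backend
    allow_volume_expansion=True,
)
client.StorageV1Api().create_storage_class(ontap_sc)
```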
Rook Storage Orchestrator for CephFS
Pros: Cloud agnostic (does not depend on a managed service). Maximum control. Supports at-rest encryption of OSDs with LUKS and a custom key. Excellent Kubernetes support for RWX persistent volumes with the Ceph CSI driver.
Cons: More complex initial setup. The Rook deployment and Helm chart need to be kept up to date periodically by the Kubernetes administrator.
- Rook Open-Source, Cloud-Native Storage for Kubernetes – https://rook.io/
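For reference, a CephFS StorageClass following the Rook example manifests looks roughly like the sketch below; the clusterID, filesystem name, pool, and secret names are the defaults from those examples and must match your own CephFilesystem resource.

```python
# Sketch of a CephFS StorageClass following the Rook example manifests
# (operator namespace "rook-ceph", filesystem "myfs"). Adjust clusterID,
# fsName, pool, and the secret names to match your CephFilesystem resource.
from kubernetes import client, config

config.load_kube_config()

cephfs_sc = client.V1StorageClass(
    metadata=client.V1ObjectMeta(name="rook-cephfs"),
    provisioner="rook-ceph.cephfs.csi.ceph.com",  # <operator-namespace>.cephfs.csi.ceph.com
    parameters={
        "clusterID": "rook-ceph",
        "fsName": "myfs",
        "pool": "myfs-replicated",
        "csi.storage.k8s.io/provisioner-secret-name": "rook-csi-cephfs-provisioner",
        "csi.storage.k8s.io/provisioner-secret-namespace": "rook-ceph",
        "csi.storage.k8s.io/controller-expand-secret-name": "rook-csi-cephfs-provisioner",
        "csi.storage.k8s.io/controller-expand-secret-namespace": "rook-ceph",
        "csi.storage.k8s.io/node-stage-secret-name": "rook-csi-cephfs-node",
        "csi.storage.k8s.io/node-stage-secret-namespace": "rook-ceph",
    },
    reclaim_policy="Delete",
    allow_volume_expansion=True,
)
client.StorageV1Api().create_storage_class(cephfs_sc)
```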
We provide Ceph implementation services on the public cloud or bare metal.