Kubernetes And Disaggregated Storage: A Perfect Match

Kubernetes and containerized applications are increasingly popular for their efficiency, scalability and ease of deployment. Containers got their start for stateless microservices used to build cloud-native applications, but their allure has spilled over to other areas that are stateful and require flexible, high-performance storage. The challenge is deploying storage solutions for containers that meet the applications’ performance requirements while preserving the philosophy of containers being unbound from server physicality.

The Growth Of Kubernetes

The Cloud Native Computing Foundation (CNCF) 2019 survey results show containers continuing to gain popularity, with 84% of respondents running containers in production, up from 23% in the CNCF's inaugural 2016 survey. The survey also notes that new serverless, service mesh and storage projects are emerging to manage or work within containers, and that cloud-native projects in production are gaining steam.

With Kubernetes, those adopting container practices can achieve server efficiencies similar to those long associated with virtualization. Very often, containers run inside virtual machines, combining the modularity of containers with the rich, mature network security and composability features of virtualization. This also means organizations can run containers and virtual machines on the same platform.

Kubernetes has grown beyond simple microservices and cloud-native applications. Machine learning and AI applications are increasingly deployed via platforms such as the Kubeflow project and NVIDIA NGC. Containers are making strides across a wide variety of applications and will likely continue to see ever wider deployment.

The Age Of Storage For Containers

More applications moving to Kubernetes environments creates a greater need for storage that's compatible with this flexible, dynamic environment. As organizations spin up database applications using Cassandra, MongoDB, CockroachDB, MySQL and others, the need for persistent storage for these applications is growing. Storage options for Kubernetes containers fall into two main categories: stateless and stateful. Stateless applications make use of ephemeral (temporary) storage. Stateful applications have workloads that need their storage to persist across container or pod stops, restarts and moves (to other physical servers).
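As a concrete sketch of the stateful case, a Kubernetes StatefulSet can request persistent storage through volume claim templates, so each replica's data survives pod restarts and rescheduling. The storage class name (`fast-block`) and mount path below are illustrative assumptions, not a specific product's configuration:

```yaml
# Minimal sketch: a StatefulSet whose replicas each get a persistent volume.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra:4.1
          volumeMounts:
            - name: data
              mountPath: /var/lib/cassandra   # database files live on the claim
  volumeClaimTemplates:            # one PersistentVolumeClaim is created per replica
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]        # mounted read-write by one node at a time
        storageClassName: fast-block          # assumed StorageClass set up by the cluster admin
        resources:
          requests:
            storage: 100Gi
```

Because the claim is bound to the replica rather than to a particular node, Kubernetes can reattach the same volume wherever the pod is rescheduled.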

Today, Kubernetes usage is growing in areas such as cloud-native, scalable applications that require stateful storage. While applications built on MongoDB, Cassandra and Apache Spark can protect their data via replication, using direct-attached storage (DAS) forces unwanted rebuilds over the network whenever a Kubernetes pod moves, planned or unplanned. To overcome this, stateful applications often turn to storage outside the Kubernetes cluster, such as network-attached storage or other remote storage.

Storage Options For Stateful Applications

In late 2018, Kubernetes introduced Container Storage Interface (CSI) plug-ins to address the growing need for third-party storage support for stateful workloads. CSI plug-ins enable third-party storage providers to expose a block or file storage system by extending the Kubernetes volume interface. Block-based technologies tend to offer the highest throughput and lowest latency and are preferred for highly transactional workloads. Three block storage technologies support CSI plug-ins: direct-attached storage (DAS), storage area networks (SAN) and NVMe over fabrics (NVMe-oF).

  • DAS provides high performance and low latency; however, this approach results in high operational overhead, poor flash utilization and long recovery times that are detrimental to the network.
  • SAN provides ease of deployment but does not measure up to DAS in performance and suffers from much higher latency.
  • NVMe over fabrics (NVMe-oF) provides SAN-like efficiency and data protection while offering DAS-like latency and performance. In particular, NVMe/TCP, which is part of the NVMe-oF standard, enables disaggregation of storage from compute over standard networks, so compute servers can share a remote cluster of NVMe storage that performs like local flash. Of course, this adds some complexity: storage traffic that was internal to the application server now crosses a network. Every environment is different, and this gain in utilization efficiency must be weighed against potential infrastructure changes in any NVMe-oF implementation.
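To illustrate how a CSI plug-in surfaces to users, a cluster administrator typically registers the vendor's driver through a StorageClass that handles dynamic provisioning. The provisioner name and `parameters` below are hypothetical placeholders standing in for whatever NVMe-oF vendor driver is actually installed:

```yaml
# Hypothetical StorageClass wiring a CSI driver to dynamic provisioning.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nvme-of-fast
provisioner: nvme-of.csi.example.com     # placeholder CSI driver name (vendor-specific)
parameters:
  protocol: tcp                          # assumed driver parameter: use the NVMe/TCP transport
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer  # delay binding until a pod is actually scheduled
allowVolumeExpansion: true
```

Any PersistentVolumeClaim that names `nvme-of-fast` as its `storageClassName` would then be provisioned on demand by the driver, with no manual volume creation.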

Why Disaggregation And NVMe-oF Matter For Containers

Combining the ease of operation of SAN with the high performance of DAS, NVMe-oF solutions are ideal for many enterprises. That said, if you have a custom application written to use a specific hardware configuration efficiently, where the data is ephemeral or protected in some other fashion, you may not benefit from disaggregation with NVMe-oF.

Likewise, NVMe-oF is simply not available everywhere yet (such as on Microsoft Windows), and you may be forced to stay with SAN, at least for now. While NVMe-oF is available on Fibre Channel and InfiniBand, it is most easily deployed on Ethernet with NVMe/TCP. Utilizing this ubiquitous fabric goes hand in hand with the Kubernetes philosophy of portability.

Disaggregation via NVMe-oF provides the benefit of separating compute and storage, which maximizes Kubernetes functionality, provides ease of migration and delivers fast recovery times. NVMe/TCP acts like local flash by delivering low latency and high performance, while at the same time providing high availability so there is no interruption in service if an individual Kubernetes application migrates or goes down.
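On the host side, the "acts like local flash" behavior can be sketched with the standard `nvme-cli` tool, which discovers and attaches remote NVMe/TCP namespaces as ordinary block devices. The IP address and NVMe Qualified Name (NQN) below are illustrative placeholders, not a real storage endpoint:

```shell
# Illustrative only: discover NVMe/TCP subsystems exported by a remote target
# (address, port and NQN are placeholders for a real storage endpoint).
nvme discover -t tcp -a 192.0.2.10 -s 4420

# Connect to a discovered subsystem by its NQN.
nvme connect -t tcp -a 192.0.2.10 -s 4420 -n nqn.2019-08.example.com:storage-pool

# The remote namespace now appears as a local block device (e.g., /dev/nvme1n1).
nvme list
```

In a Kubernetes deployment, a CSI driver would typically issue equivalent connect calls on whichever node the pod lands, which is what makes pod migration transparent to the application.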

Kubernetes Storage Tips

As organizations determine their Kubernetes storage needs, IT teams should do the following:

  • Identify current compute and storage needs and how the organization plans to scale these needs.
  • Determine the split of stateful versus stateless storage to determine overall storage needs.
  • Understand the cost of a disruption to the organization and plan to address disruptions with an easily deployable backup that delivers fast recovery times if the container or entire server goes down.

As Kubernetes applications deploy at a rapid rate, the need for persistent, stateful, high-performance storage is growing. Organizations that need to protect their assets rely on fast recovery times and/or failover when a container or entire server goes down. CSI plug-ins for NVMe-oF storage solutions provide organizations with high-performance, persistent and resilient options that allow them to meet the toughest SLAs in Kubernetes environments.

Article originally published on Forbes, Forbes Technology Council.