Kubernetes Persistent Storage: Efficient and Scalable Solutions

This educational blog presents a high-level overview of Kubernetes storage. For more information on persistent storage for containers, start with our LightGUIDE on Kubernetes: Persistent Storage for Containers.

Kubernetes, an open-source platform for automating application deployment, scaling, and operations, has become the de facto standard for container orchestration of modern cloud-native applications. As organizations transition toward microservices architectures and containerized applications, the need for efficient, performant, scalable, and feature-rich Kubernetes storage solutions has intensified. This blog details the requirements for Kubernetes storage solutions, evaluates traditional storage architectures for Kubernetes, explores the benefits of software-defined storage (SDS) with NVMe® over TCP, and presents examples of why the Lightbits enterprise cloud data platform stands out as the ideal choice for companies building Kubernetes cloud services.

What Do Companies Need Today for Kubernetes?

The versatility and power of Kubernetes in handling diverse and demanding applications are what make it a cornerstone technology in modern IT infrastructure. Its efficiency in managing large-scale data processing makes it an ideal platform for big data and data analytics tasks.

Recent research, The Voice of Kubernetes Experts Report 2024, indicates that adoption is growing quickly: 80% of organizations predict they will build most of their new applications on cloud-native platforms over the next 5 years, and 65% plan to migrate from VMs to containers in the next 2 years.

What is Kubernetes being used for? The same research indicates it’s increasingly being used to enable data-intensive services, including databases, analytics, and AI/ML workloads.

Source: The Voice of Kubernetes Experts Report 2024

When asked what data management capabilities were needed to run Kubernetes, organizations overwhelmingly agreed that a unified platform and high availability were the most desirable features.
Source: The Voice of Kubernetes Experts Report 2024

If you are among those considering building your cloud-native applications on Kubernetes, you’ll need several key infrastructure components to leverage it effectively and to ensure successful deployment and operation. Here’s an in-depth look at the capabilities of storage for Kubernetes that you will want to consider before migrating: 

Kubernetes Persistent Storage: By default, containers in Kubernetes are ephemeral, meaning that any data stored within them is lost when the container shuts down or is terminated. Persistent storage for containers is essential for unlocking the full potential of Kubernetes, enabling you to run a wide range of workloads, including stateful applications, with confidence and efficiency. Many applications require persistent data storage to maintain their state across multiple instances, updates, failures, or restarts; examples include relational databases such as PostgreSQL, NoSQL stores, and other stateful services.
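
To make this concrete, here is a minimal sketch of how persistent storage is requested and consumed in Kubernetes: a PersistentVolumeClaim plus a PostgreSQL pod that mounts it. The storage class name fast-nvme-tcp is a hypothetical placeholder; substitute whatever class your cluster's provisioner exposes. Because the database files live on the claimed volume rather than in the container's writable layer, they survive restarts and rescheduling.

```yaml
# A claim for 100Gi of block-backed storage; "fast-nvme-tcp" is a
# hypothetical StorageClass name used here for illustration only.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-nvme-tcp
  resources:
    requests:
      storage: 100Gi
---
# A pod that mounts the claim; PostgreSQL's data directory now lives
# on the persistent volume and survives container restarts.
apiVersion: v1
kind: Pod
metadata:
  name: postgres
spec:
  containers:
    - name: postgres
      image: postgres:16
      env:
        - name: POSTGRES_PASSWORD
          value: example-only   # use a Secret in real deployments
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: postgres-data
```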

Scalability and Flexibility: You will require a scalable solution that can handle fluctuating workloads efficiently. Kubernetes provides this scalability, but the underlying storage must also scale seamlessly to accommodate growing data volumes and performance demands. Dynamic scaling in any direction (up, down, or out) ensures optimal Kubernetes storage utilization and cost efficiency. In addition, the best storage for Kubernetes will have the flexibility to manage clusters across different environments: on-premises, public cloud, or hybrid cloud.
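
Dynamic provisioning and in-place growth are typically expressed through a StorageClass. The sketch below uses a hypothetical CSI driver name (csi.example.com); with allowVolumeExpansion enabled and a driver that supports it, a claim can later be grown simply by editing its requested size.

```yaml
# Hypothetical StorageClass for illustration; "csi.example.com" stands in
# for whatever CSI driver your storage platform provides.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: scalable-block
provisioner: csi.example.com
allowVolumeExpansion: true              # claims using this class can be grown in place
volumeBindingMode: WaitForFirstConsumer # provision where the pod actually lands
reclaimPolicy: Delete
```

Growing a volume then becomes a one-field change to the claim's spec.resources.requests.storage, and many CSI drivers support online expansion without redeploying the pod.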

High Availability and Resilience: Ensuring applications are always available is critical. Kubernetes storage needs to deliver high availability (HA) and data resilience, ensuring that the control plane and worker nodes are highly available and can withstand failures without impacting application uptime. Most Kubernetes storage solutions will have built-in mechanisms to automatically detect failures and redirect traffic to healthy nodes or clusters, ensuring continuous service availability.

Performance: With applications generating and processing vast amounts of data in real-time, storage performance is paramount. Low latency and high throughput are essential to meet the performance requirements of databases, analytics, and AI/ML workloads.

Cost-Efficiency: Most customers tell us that they operate under tremendous budget constraints, which necessitate cost-effective storage solutions. You’ll want to look for Kubernetes storage solutions that offer a balance between performance and cost, avoiding overprovisioning while still meeting performance needs. Software-defined storage that runs on commodity hardware and uses efficient networking like NVMe over TCP delivers both performance and cost-efficiency.

Ease of Management: Do not introduce complexity to your infrastructure with Kubernetes storage. Look for Kubernetes storage solutions that simplify management tasks, offer seamless integration with Kubernetes, and provide robust monitoring and automation capabilities. Unified management of the Kubernetes cluster from a single pane of glass will reduce manual intervention and administrative overhead.

Security: Protecting data from breaches and ensuring compliance with regulations is non-negotiable. Kubernetes storage solutions must offer strong Identity and Access Management (IAM) policies to control access to the Kubernetes cluster and its resources, network policies to secure communication between pods and services within the cluster, and data encryption at rest and in transit to protect sensitive information. 
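
As one concrete illustration of the network-policy layer mentioned above, the sketch below restricts ingress so that only pods labeled app: backend can reach the database pods on their service port; the labels and port number are illustrative assumptions, not values from any particular product.

```yaml
# Illustrative NetworkPolicy: only pods labeled app=backend may reach
# the database pods on port 5432; all other ingress to them is denied.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-backend-only
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: backend
      ports:
        - protocol: TCP
          port: 5432
```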

What are the limitations of SAN, NAS, DAS, and HCI storage architectures for Kubernetes?

Legacy storage systems are not well suited as cloud-native storage for Kubernetes. Why? Legacy Storage Area Network (SAN), Network-Attached Storage (NAS), Direct-Attached Storage (DAS), and Hyperconverged Infrastructure (HCI) systems weren’t built for today’s I/O-intensive workloads and as such come with inherent limitations. These systems can be expensive and complex to procure and manage, can result in underutilized storage resources, have rigid capabilities, and aren’t flexible enough to adapt as infrastructure needs change.

Storage Area Network (SAN)

SANs are expensive and complex to manage, requiring dedicated hardware, networking infrastructure, and specialized knowledge to maintain. Scaling SANs can be challenging and costly, often requiring significant hardware investments and complex reconfigurations. SANs can suffer from performance bottlenecks due to network latency and bandwidth limitations, which can impact the performance of Kubernetes applications.

Network-Attached Storage (NAS)

NAS devices, connected over standard network protocols, can introduce latency and have limited throughput compared to other storage solutions, affecting the performance of high-demand applications. While easier to scale than SANs, NAS systems still face limitations in scaling efficiently to meet the demands of large, dynamic Kubernetes environments. Traditional NAS setups can have single points of failure, impacting the availability and reliability required for Kubernetes workloads.

Direct-Attached Storage (DAS)


DAS is tied to the specific node it is attached to. When a pod is rescheduled to a different node, the storage does not move with it, which makes it difficult to maintain stateful applications, since Kubernetes may reschedule pods for many reasons (e.g., node failures or scaling events). Moving application pods with DAS is generally not supported in the same seamless way as with more flexible storage abstractions such as Persistent Volumes (PVs). If specific performance requirements necessitate DAS, you can explore local PVs combined with node affinity, as sketched below, but this approach is restrictive: it ties pods to the specific nodes that hold the required DAS, limiting the flexibility and resilience of your cluster and adding management complexity. DAS also lacks inherent high availability and data resilience features, making it less suitable for critical applications that require continuous availability.
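
For readers who do need that pattern, here is a minimal sketch, assuming a hypothetical node name (worker-node-1) and device mount path: a statically provisioned local PersistentVolume pinned to one node via nodeAffinity. Any pod bound to this volume can only ever be scheduled on that node, which is precisely the trade-off described above.

```yaml
# Illustrative local PersistentVolume; the hostname and path are assumptions.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-node1
spec:
  capacity:
    storage: 500Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/nvme0n1        # a DAS device mounted on the node
  nodeAffinity:                     # ties the volume (and its pods) to one node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-node-1
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner   # local PVs are statically provisioned
volumeBindingMode: WaitForFirstConsumer
```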

Hyperconverged Infrastructure (HCI)

HCI can be problematic in dynamic Kubernetes environments where resource demands fluctuate. In HCI, compute and storage resources are tightly coupled, which can lead to resource contention: high storage I/O demands can impact compute performance and vice versa, affecting Kubernetes workloads that need balanced resource allocation. This coupled scaling can lead to inefficient resource usage and increased costs, often resulting in the overprovisioning of resources. The integrated storage in these systems may not match the performance of specialized, high-speed storage solutions like NVMe over TCP, which can lead to suboptimal performance for workloads that require high I/O throughput and low latency.

Cloud providers’ native storage offerings also have limitations: performance can be insufficient for today’s database, analytics, and AI/ML workloads; the higher performance tiers come at a high cost; and enterprise-grade resiliency and à la carte data services are often lacking. Whether you’re building cloud-native applications or refactoring legacy applications, cloud storage charges can be costly.

Worse yet, all are packaged for the market in a way that results in vendor lock-in. 

Why is software-defined storage based on NVMe over TCP ideal for Kubernetes?

Software-defined storage (SDS) based on NVMe over TCP (NVMe/TCP) offers unparalleled performance and flexibility. NVMe (Non-Volatile Memory Express) is designed for high-speed storage media, delivering low latency and high throughput, crucial for performance-sensitive Kubernetes applications. When combined with TCP/IP networking, it enables these benefits over standard Ethernet networks, eliminating the need for specialized and costly hardware. NVMe/TCP ensures that Kubernetes clusters can achieve the high I/O performance necessary for demanding database, analytics, and AI/ML applications in large-scale services. 

“NVMe/TCP allows organizations to provision scalable storage without having to change their network architecture fundamentally, and provides latencies akin to that provided from conventional direct-attached storage.” – Eric Burgener, VP of Research at IDC 

Furthermore, SDS and NVMe/TCP together offer significant cost and operational efficiencies. By utilizing existing Ethernet infrastructure, you avoid the high costs associated with proprietary storage networking solutions while still benefiting from NVMe’s high performance. The decoupling of storage from the underlying hardware in SDS allows for dynamic, flexible, and scalable storage provisioning, perfectly aligning with the elastic nature of Kubernetes, where workloads can scale up or down based on demand. This makes it easier to scale storage capacity independently from compute resources, providing a more granular and efficient way to manage resources and reducing the need for overprovisioning. Additionally, SDS solutions often come with advanced features such as automation, robust monitoring, and simplified management interfaces, which are essential for managing Kubernetes clusters. These features help reduce administrative overhead, streamline operations, and improve the overall agility and responsiveness of IT infrastructure, making SDS combined with NVMe/TCP an ideal Kubernetes persistent storage solution.

SDS platforms also offer high availability and resiliency. As mentioned earlier in the blog, The Voice of Kubernetes Experts Report revealed that availability for disaster recovery (DR) strategies was the most critical capability organizations required from storage for Kubernetes. SDS platforms are designed with built-in redundancy and failover mechanisms, providing high availability and data resilience so that Kubernetes applications remain operational even during hardware failures.

Cloud-Native Storage for Kubernetes Architecture

Several storage solutions are available for deployment within the Kubernetes ecosystem, but they often suffer from performance issues due to their reliance on replication. Low latency performance is a foundational requirement. The performance should be close to that of local DAS flash drives. While delivering this performance, the solution must also support Kubernetes portability. 

Running storage within the Kubernetes framework alongside applications can introduce the “noisy neighbor” problem, where storage serving one pod’s workload consumes resources needed by adjacent pods, impacting overall CPU utilization. This setup also means that worker nodes are not 100% available for applications, necessitating additional resource planning and adding complexity to the infrastructure.

The best approach is to deploy a cloud-native storage architecture. This should make some intuitive sense. Many, if not most applications running on Kubernetes are cloud-native. A cloud-native storage solution offers the kind of flexibility that matches the portability and agility of Kubernetes. The issue is performance. For these reasons, utilizing a disaggregated cloud-native storage solution attached via a Container Storage Interface (CSI) plugin is recommended for optimal Kubernetes storage performance. The goal is to achieve local flash drive-caliber storage performance within Kubernetes by using a disaggregated, dedicated storage framework that is software-defined, fault-tolerant, and supports essential data services like thin provisioning, snapshots, and clones.
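
Those data services surface to applications as ordinary Kubernetes objects once a CSI driver is attached. The sketch below is illustrative only: the driver name csi.example.com and the claim names are assumptions, and it presumes the standard external-snapshotter CRDs and controller are installed. It defines a snapshot class, snapshots an existing claim, and restores the snapshot into a new volume (a clone).

```yaml
# Hypothetical CSI driver name; requires the external-snapshotter CRDs/controller.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-snapclass
driver: csi.example.com
deletionPolicy: Delete
---
# Snapshot an existing PVC; the claim name is illustrative.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgres-data-snap
spec:
  volumeSnapshotClassName: csi-snapclass
  source:
    persistentVolumeClaimName: postgres-data
---
# Clone: a new PVC restored from the snapshot.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-clone
spec:
  storageClassName: scalable-block   # hypothetical class from the earlier sketch
  dataSource:
    name: postgres-data-snap
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
```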

Lightbits: The Best Storage for Kubernetes

Lightbits persistent storage for Kubernetes makes it possible to enjoy the advantages of Kubernetes portability without compromising storage performance or scalability. Lightbits delivers low-latency performance that is as close as possible to local flash. The NVMe/TCP software-defined storage solution can support hundreds of Kubernetes clusters from a single Lightbits storage cluster without inhibiting the Kubernetes container portability model.

Portability comes from Lightbits’ disaggregated architecture. Lightbits uses CSI to allow Kubernetes pods to move between servers on the network. If a Kubernetes pod moves to a new physical server, the same Lightbits persistent volume will be attached to that new machine and remain available to the pod. Additionally, in contrast to other Kubernetes storage solutions, a single Lightbits cluster can support multiple container orchestration platforms, including Red Hat OpenShift Container Platform (OCP), VMware Tanzu, OpenStack, and VMware vSphere, as well as bare-metal applications.
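
In practice, that portability is expressed with ordinary Kubernetes objects. The sketch below is a generic StatefulSet whose volumeClaimTemplates request storage from a CSI-backed class; the class name lightbits-nvme-tcp is an illustrative assumption rather than a documented default. If Kubernetes reschedules a replica onto another node, the CSI driver reattaches the same volume to the new node, so the pod finds its data waiting.

```yaml
# Illustrative StatefulSet; the storage class name is an assumption, not a
# documented Lightbits default. Each replica gets its own persistent volume.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:7
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: lightbits-nvme-tcp   # hypothetical class name
        resources:
          requests:
            storage: 200Gi
```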

In architectural terms, Lightbits’ persistent storage for Kubernetes features a powerful control plane that is designed to function in an asynchronous mode—handling thousands of requests at any given moment. This approach is suited to large-scale Kubernetes clusters that must handle Persistent Volume Claims (PVCs) that are created, modified, or deleted by the thousands every day.

Lightbits cloud-native storage for Kubernetes topology.

Kubernetes Storage Benchmarks

The metrics reveal Lightbits’ capability to deliver up to 4M IOPS in a Kubernetes environment: up to 4X higher throughput for 4K reads, up to 17X more throughput for 4K writes, and up to 13X more throughput for 8K read/write workloads. For 32K read/write workloads the solution provides up to 6X more throughput. The solution also scales efficiently, which is critical for success with Kubernetes.

I/O performance comparison in a 12-node Kubernetes environment (identical hardware configuration and same number of nodes).

Cloud Services Built for Kubernetes – Block Storage

By choosing Lightbits software-defined storage, you ensure that your Kubernetes environments are robust, scalable, and ready to meet the demands of modern applications, ultimately driving greater efficiency and innovation for your organization.

Kubernetes Platform as a Service, Powered by Lightbits

metalstack.cloud, a cloud-native offering from x-cellent technologies, is designed to deliver high-performance, secure, and compliant cloud infrastructure. This managed, bare-metal Kubernetes platform caters to the German SMB market, ensuring GDPR compliance and superior performance. The Lightbits cloud data platform plays a critical role in providing high-performance, efficient, and persistent storage for this architecture. This technical collaboration enables fast, resilient, and secure services, setting a new standard for cloud-native services with high data protection and performance.

“No other storage solution had the same price/performance ratio and tight integration with Kubernetes. The achievable low latency is outstanding.” – Stefan Majer, CTO at x-cellent technologies GmbH

Meeting the needs for multi-tenancy security and performance, Lightbits offers exceptional latency and throughput, eliminating the need for additional network technologies. Its software-defined, disaggregated storage solution supports high performance and scalability, preserving Kubernetes portability and agility.

Customer benefits from implementing Lightbits persistent storage for containers

  • High Performance: exceptional latency and throughput
  • Resiliency: ensures data resiliency
  • Cost-Efficiency: competitive pricing for high-performance storage

To continue your learning, read the customer case study: metalstack.cloud Delivers Kubernetes as a Service with Lightbits

eCommerce Giants Built Kubernetes Real-time Data Platforms

One of the world’s largest online retailers, boasting over 80 million products and 7.5 million registered users, faced the challenge of modernizing its IT infrastructure. They sought to transition from JBOD and JBOF designs to a fully disaggregated, software-defined model to support their extensive Kubernetes environment. Their high-performance database applications required a solution that offered high performance and low latency, scalability, and efficient utilization.

“Lightbits gives us a flexible, agile, and efficient storage platform that enables the Kubernetes portability we need without compromising on storage performance or scalability.”

The eCommerce giant chose the Lightbits Cloud Data Platform for its high-performance, low-latency, and scalable storage. Lightbits leverages NVMe/TCP to enhance database workloads while simplifying operations and reducing costs. The platform supports Kubernetes portability, allowing workloads to move seamlessly across clusters and improving uptime and availability.

Customer benefits from implementing Lightbits persistent storage for containers

  • Flexibility: runs on any cloud or hardware configuration with CSI plugin providing seamless Kubernetes integration
  • Scalability: independent scaling in any direction
  • Efficiency: resilient storage at local flash performance
  • Agility: the ability to move, shift, and allocate storage to any cloud

To continue your learning, read the customer case study: Lightbits Powers One of the World’s Largest eCommerce Platforms

Conclusion

In today’s dynamic and demanding IT landscape, Kubernetes has become a pivotal technology for deploying and managing containerized applications. However, the success of Kubernetes deployments hinges significantly on the underlying storage infrastructure. Traditional storage architectures like SAN, NAS, and DAS present limitations in scalability, performance, and manageability that make them less suitable for modern Kubernetes environments.

Software-defined storage, particularly with NVMe over TCP, addresses these limitations by offering a high-performance, scalable, and cost-effective solution. Lightbits Labs SDS exemplifies these benefits, providing an ideal storage solution for Kubernetes. It combines the high performance of NVMe, the flexibility of SDS, and the cost-efficiency of TCP/IP networks, making it the best storage for Kubernetes deployments.

Additional learning materials:

About the Writer: