When evaluating OpenShift Virtualization storage, the most critical decision is choosing a storage architecture that delivers consistent low latency, supports Kubernetes-native operations, and scales with both VM and container workloads.
The right storage solution directly impacts VM performance, live migration success, and overall platform uptime.
What Storage is Best for Running VMs on OpenShift Virtualization?
For most production-scale environments, the best storage is high-performance, low-latency block storage, ideally NVMe-based software-defined storage such as Lightbits LightOS block storage for OpenShift Virtualization. While ODF/Ceph and SAN solutions are viable, NVMe over TCP delivers higher IOPS, lower latency, and better scalability and hardware efficiency for production VM workloads.
While you can technically use any storage with a certified CSI driver, consider how the storage system will perform and scale to meet your requirements. The three primary options are:
| Storage for OpenShift-V | Pros | Cons | Use Case |
|---|---|---|---|
| ODF/Ceph | The “easy” button, natively integrated into OpenShift-V; built-in replication | High CPU/RAM overhead on worker nodes, which raises storage costs for high-performance workloads | Small-to-mid environments prioritizing simplicity |
| Legacy SAN/NAS | Use the hardware you already own | Proprietary hardware lock-in; performance is limited by the efficiency of the specific CSI driver | Transitional environments; need to leverage existing hardware investments |
| Modern, Lean NVMe over TCP Stacks | Highest performance, lowest latency | The organization must be committed to NVMe storage | High-performance, latency-sensitive workloads (databases, high-scale VM density) |
While Ceph is a solid choice for general-purpose workloads where unified object/file/block access or lower performance requirements are acceptable, your mission-critical, latency-sensitive workloads—such as SQL, NoSQL, and Vector databases—demand a specialized engine.
By integrating Lightbits alongside Ceph in OpenShift-V environments, you gain the ultimate architectural flexibility. Our software-defined approach allows you to standardize your hardware procurement on a single server specification while deploying the right storage protocol for the right job. To learn more about how Lightbits compares to Ceph storage or how to augment your Ceph environment, download the white paper.
How does OpenShift Virtualization Use Kubernetes Persistent Storage?
To evaluate storage for OpenShift Virtualization, you must understand its departure from the “datastore” model: it replaces the bulk-container approach with a disk-as-a-resource model. OpenShift-V uses PersistentVolumeClaims (PVCs) to represent VM disks, essentially treating storage as an API-driven resource instead of a static datastore. Think of it as replacing a shared storage pool with on-demand, policy-driven virtual disks that can be dynamically created, resized, and moved. Through the Containerized Data Importer (CDI), VM images are imported into PVC-backed disks, enabling consistent management, snapshots, and policies across both VMs and containers via CSI drivers.
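The disk-as-a-resource model above is expressed through CDI's DataVolume resource. Below is a minimal sketch of such a manifest, written as a Python dict for illustration; the disk name, image URL, and `fast-nvme` StorageClass are hypothetical placeholders, not values from the original text:

```python
# Illustrative sketch: a CDI DataVolume that imports a VM image into a
# PVC-backed disk. Name, URL, and StorageClass are hypothetical.
fedora_disk = {
    "apiVersion": "cdi.kubevirt.io/v1beta1",
    "kind": "DataVolume",
    "metadata": {"name": "fedora-vm-disk"},           # hypothetical name
    "spec": {
        # CDI pulls the source image and writes it into the PVC below.
        "source": {"http": {"url": "https://example.com/fedora.qcow2"}},
        "storage": {
            "accessModes": ["ReadWriteMany"],         # RWX enables live migration
            "volumeMode": "Block",                    # raw block device for the VM
            "resources": {"requests": {"storage": "30Gi"}},
            "storageClassName": "fast-nvme",          # hypothetical StorageClass
        },
    },
}
```

Once applied, the resulting PVC is managed like any other Kubernetes resource: it can be snapshotted, resized, and governed by the same policies as container storage.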
What are the Requirements for High-Performance VM Storage on OpenShift?
High-performance VM storage requires:
- Block volume mode (volumeMode: Block)
- Low and consistent latency
- ReadWriteMany (RWX) support for live migration
- Storage QoS to prevent noisy neighbor issues
- High IOPS and throughput under load
- No performance degradation during node failure or rebuild events
A one-size-fits-all list of requirements doesn’t work when you’re evaluating a platform for everything from a Linux web server to a SAP HANA instance. The specific workload’s I/O profile should be mapped to the right storage architecture. For example, database and high-transactional workloads, such as SQL and Oracle, are latency-sensitive. A microsecond of delay in a storage write can cascade into a bottleneck for the entire application.
Performance in a virtualized Kubernetes environment is measured by more than just raw IOPS. For the database example, the performance tier required is NVMe-backed block storage with high sustained IOPS and consistently low tail latency. In this case, you don’t just care about average speed; you need to ensure the slowest 1% of transactions don’t stall the database.
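Two items on the checklist above, block volume mode and RWX access, are visible directly in a PVC spec, so they can be encoded as a quick pre-flight check during evaluation. This is an illustrative sketch; the function name and dict shape are our own, not a real Kubernetes API:

```python
# Sketch: check that a PVC spec (as a dict) satisfies the two checklist
# items that appear in the manifest itself: block mode and RWX access.
def meets_vm_storage_requirements(pvc_spec: dict) -> list[str]:
    """Return a list of problems; an empty list means the spec passes."""
    problems = []
    if pvc_spec.get("volumeMode") != "Block":
        problems.append("volumeMode must be 'Block' for raw VM disk performance")
    if "ReadWriteMany" not in pvc_spec.get("accessModes", []):
        problems.append("RWX access mode is required for live migration")
    return problems

good = {"volumeMode": "Block", "accessModes": ["ReadWriteMany"]}
bad = {"volumeMode": "Filesystem", "accessModes": ["ReadWriteOnce"]}
assert meets_vm_storage_requirements(good) == []
assert len(meets_vm_storage_requirements(bad)) == 2
```

The remaining items (latency, QoS, behavior under rebuild) cannot be read from a manifest; they must be validated with load testing against the actual backend.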
How Do You Migrate VM Workloads to OpenShift Virtualization Storage?
When migrating from VMware vSphere to OpenShift Virtualization, the mindset is one of strategic platform evolution and the elimination of friction points. An organization must evaluate not just how to move data, but also how the underlying storage architecture affects how VMs are managed. Here are the critical considerations for migrating VM workloads to OpenShift-V storage.
- The primary tool for this transition is the Migration Toolkit for Virtualization (MTV), a free, Red Hat-supported operator that connects directly to your VMware vCenter. MTV automates the conversion from VMware’s .vmdk format to the QCOW2 or raw disk format used by OpenShift Virtualization.
  - Key Consideration: Ensure your storage target supports CSI snapshots. MTV uses these to perform “warm migrations,” which keep the source VM running while data copies, minimizing the final cutover window.
- Map your VMware storage policies to OpenShift StorageClasses.
  - Key Consideration: In OpenShift-V, thin provisioning typically replaces thick provisioning, enabling much higher storage density and cost savings.
- A common bottleneck in VMware migrations is the network between the legacy vSphere environment and the new OpenShift-V cluster. If you are migrating 50TB+ of data, your OpenShift storage ingress must be able to handle the write load without impacting production containers already running on the cluster.
  - Key Considerations:
    - Evaluate whether you need a dedicated migration network in OpenShift so that moving large VM disks doesn’t introduce latency for your existing applications.
    - Plan for parallel DataVolume transfers, storage backend write-throughput validation, and network saturation testing. Without this preparation, migration windows can expand significantly, impacting production workloads.
- Day 2 operations in OpenShift look different from VMware’s. For example, OpenShift makes it much easier to expand a disk: you simply edit the PVC, and the CSI driver handles the rest.
  - Key Consideration: Your VMware backup agents may not work the same way inside a Kubernetes-native VM. Evaluate whether your CSI storage for virtual machines supports OADP (OpenShift APIs for Data Protection).
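The disk-expansion point above can be sketched in code. This is a toy model of the JSON-merge-style patch you would apply to the PVC (for example with `oc patch`); the PVC shape, sizes, and helper function are hypothetical, and the underlying CSI driver must support volume expansion:

```python
# Sketch: in OpenShift-V, growing a VM disk is a PVC edit. This builds
# the patch body that raises spec.resources.requests.storage.
def _gib(size: str) -> int:
    """Parse a 'NGi' size string into an integer (sketch: Gi only)."""
    assert size.endswith("Gi"), "sketch only handles Gi sizes"
    return int(size[:-2])

def expand_pvc(pvc: dict, new_size: str) -> dict:
    """Return a merge-patch body that grows the PVC to new_size."""
    current = pvc["spec"]["resources"]["requests"]["storage"]
    # Real controllers reject shrinking a volume; we only allow growth.
    if _gib(new_size) <= _gib(current):
        raise ValueError(f"new size {new_size} must exceed current {current}")
    return {"spec": {"resources": {"requests": {"storage": new_size}}}}

vm_disk = {"spec": {"resources": {"requests": {"storage": "30Gi"}}}}
patch = expand_pvc(vm_disk, "50Gi")
assert patch == {"spec": {"resources": {"requests": {"storage": "50Gi"}}}}
```

The point of the sketch is the workflow: no datastore rescan, no hypervisor-side resize; the CSI driver picks up the edited claim and expands the volume.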
How Does Storage Impact OpenShift VM Performance and Uptime?
The relationship between storage and uptime is absolute: storage directly determines whether your OpenShift-V platform can meet enterprise SLAs.
- Slow storage → failed Live VM Migrations → unplanned downtime
- High latency → degraded application performance
- Noisy neighbors → unpredictable VM behavior
In the evaluation phase, consider these two impacts:
1. Live VM Migrations: In a traditional hypervisor, storage is shared. In OpenShift, if your storage backend is slow to acknowledge writes or lacks RWX support, Live Migration will time out. This means every time you patch an OpenShift node, your VMs must be hard-rebooted rather than moved seamlessly.
2. Boot Storms: When 50 VMs restart simultaneously after a maintenance window, the “Boot Storm” creates a massive spike in read IOPS. During your evaluation phase, run a load test to see how your storage backend handles concurrent VM initialization.
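In practice, a boot-storm test would be run with a purpose-built tool such as fio against the actual storage backend. The toy Python sketch below only illustrates the shape of such a test: fire concurrent sequential readers (standing in for booting VMs) at a scratch file and record per-reader latency, since tail latency is what stalls a real boot storm:

```python
# Toy boot-storm sketch: N concurrent sequential readers against a
# scratch file, recording how long each "boot" takes. This is NOT a
# substitute for testing the real backend with a tool like fio.
import os, tempfile, time
from concurrent.futures import ThreadPoolExecutor

def simulate_boot_storm(path: str, readers: int, chunk: int = 1 << 20) -> list[float]:
    def one_boot(_):
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(chunk):        # sequential read, like an OS image boot
                pass
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=readers) as pool:
        return list(pool.map(one_boot, range(readers)))

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 << 20))      # 4 MiB stand-in for a boot image

latencies = simulate_boot_storm(tmp.name, readers=8)
worst = max(latencies)                  # the tail, not the average, matters
assert len(latencies) == 8 and worst > 0
os.unlink(tmp.name)
```

In an evaluation, compare the worst-case latency under concurrency against the single-reader baseline: a backend that degrades sharply under concurrent initialization will struggle after every maintenance window.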
OpenShift Virtualization Storage Comparison Summary
| Feature | Ceph | SAN | NVMe/TCP |
|---|---|---|---|
| Ease of Use | Natively integrated | Proprietary hardware; complex | Software-defined; simple |
| OpenShift VM Storage Performance | High | Variable (Hardware-dependent) | Extreme |
| Live Migration | Native Support | Requires RWX Config | Native Support |
| Scalability | Linear | Hardware Limited | Extremely High |
Common Mistakes When Evaluating OpenShift Persistent Storage for VMs
- Assuming all CSI drivers deliver equal performance
- Ignoring tail latency capabilities
- Not pre-testing boot storms in the evaluation phase
- Overlooking RWX requirements
- Underestimating migration bandwidth