MySQL Storage: A Complete Explanation
What is MySQL Storage?
MySQL is the world’s most widely used open-source Relational Database Management System (RDBMS). In the entire universe of databases, it’s ranked number two by DB-Engines, the respected database ranking site, right behind Oracle. The Stack Overflow Developer Survey 2020 rated MySQL as the most popular database for that year. You may already know that MySQL is suitable for any application requiring a transactional SQL engine, including mission-critical apps and heavily trafficked websites.
Here’s something you probably didn’t know about MySQL: Michael Widenius, MySQL’s co-creator, named the software for My, his daughter. And you probably thought “my” meant it was “your” personal database.
MySQL’s popularity has led to the RDBMS being widely deployed. And, as with any database, MySQL needs storage. The conventional approach is to use Direct-Attached Storage (DAS) with NVMe flash for low latency and high application performance. But there are tradeoffs you should be aware of with that approach. A more modern approach to storage can deliver better capital expenditure (CapEx) and operational expenditure (OpEx) results.
What are the limitations of using Direct-Attached Storage with MySQL?
The common practice of utilizing DAS with high-performance NVMe flash storage directly for MySQL creates a few expensive trade-offs. This approach ties storage directly to compute, which prevents independent scaling of the compute capacity (CPU and memory) and storage capacity. While the system may start out in balance, the setup quickly becomes quite inefficient as DAS can get maxed out, it can be difficult to manage at scale (high OpEX), and is often overprovisioned (high CapEx) to keep up with unpredictable business demands. It can also result in stranded capacity, silos of storage, and underutilization of the flash. None of which is ideal for infrastructure economics.
For example, if you have four MySQL compute instances, each with its own DAS, two of the compute instances could be at 10% of storage capacity utilized, while the other two might be over 90%. In the former case, that’s a costly instance of stranded storage capacity. In the latter case, internal storage is so full it will invariably be sluggish—and you will have to add another compute instance just to get more storage capacity. The newly added DAS will then be underutilized. Overall, the total cost of ownership (TCO) will be higher than it should be.
How to achieve high performance with MySQL
It is possible to get better storage performance and utilization for MySQL. One approach is to adopt an architecture that disaggregates storage from compute, such as using software-defined storage (SDS) for storage disaggregation. This can be implemented as a cluster storage pool, and when coupled with NVMe®overTCP (NVMe/TCP), can deliver equivalent application performance to DAS with lower (i.e., better) latencies for MySQL. This approach eliminates a lot of the inefficiencies of the DAS model. In addition to being able to scale storage independently from compute, benefits include shared storage architecture for easier manageability and better resource utilization. All of these benefits can have a significant impact on reducing CapEx and OpEx
Benefits of using NVMe / TCP with MySQL in a disaggregated architecture
Taking the idea of utilizing a more modern storage approach for MySQL using SDS coupled with NVMe/TCP adds further benefits to disaggregated architecture. NVMe/TCP is a simple, ubiquitous, and highly efficient storage protocol. Any modern data center network is designed primarily for TCP/IP networking. Any modern Linux server comes with built-in NVMe/TCP drivers. NVMe/TCP uses the standard Linux networking stack and Network Interface Cards (NICs) without any modifications.
Lightbits combines a disaggregated architecture with a pool of NVMe/TCP flash storage that’s provisioned into logical volumes exposed to compute instances as a unique NVMe namespace. As SDS running on standard Linux servers, it is optimized to deliver high performance and consistently low latency for Input/Output (I/O) intensive workloads like MySQL. It’s an effective alternative to DAS:
- Increased Application Performance—The pool of flash memory, with inline data reduction, enables a dramatic reduction in “tail latency,” which can be thought of as the time it takes to complete the straggling I/O requests that require the most time to finish.
- Standard access that’s simple and secure—Lightbits provides full support for standard NVMe over Fabric (NVMe-oF) using standard NVMe connection protocols and Access Control List (ACL) functionality. By using standard NVMe-oF mechanisms, storage administrators can set up ACLs for each NVMe/TCP volume exported by Lightbits. In turn, Lightbits will only let NVMe/TCP-connected clients access volumes for which the client has access permissions.
- Reduced TCO—a disaggregated architecture combined with NVMe/TCP makes it possible for system owners managing MySQL to realize a lower TCO. The need to overprovision to keep up with demand, thereby producing stranded capacity, is eliminated. In addition, system owners can cut TCO by only upgrading storage servers or application servers when necessary, rather than when DAS capacity runs out. There is no need to bundle and waste storage or CPU capacity.
Lightbits ran a comparison between MySQL with DAS and MySQL with Lightbits disaggregated SDS. The first step in the study created MySQL benchmarks by using a standard cloud-based workload where each MySQL node contained NVMe SSDs in a DAS configuration. A machine set up as a dedicated Stress Client ran the benchmark workload on each MySQL compute node. After that, the same Stress Clients ran the same benchmark workload on the same MySQL nodes, but this time with the MySQL nodes using disaggregated storage residing in a Lightbits storage server and accessed over NVMe/TCP.
The study featured the Yahoo! Cloud Serving Benchmark (YCSB) tool being executed on the Stress Clients. In both DAS and Lightbits configurations, the YCSB benchmarking software was configured to run on a MySQL database whose size was calculated to consume ~60% of the usable capacity. The YCSB workload was a mix of 50% reads and 50% updates, with 4K record sizes using a uniform distribution.
The results revealed some striking differences between MySQL with DAS and MySQL with the disaggregated Lightbits SDS. While Lightbits and DAS showed similar “read and update average latency,” in the case of “read and update 99.9% tail latency,” Lightbits was 2X lower (i.e., better) than DAS as workload intensity increased. When hardware-accelerated inline compression was added to the test, Lightbits reduced total database size by 40%. Tail latency dropped another 30%.
MySQL is such a commonly used RDBMS that nearly every IT department will find itself grappling with MySQL storage challenges. The predominant DAS approach is not ideal. It’s inefficient and leads to poor performance and higher TCO than is possible with a disaggregated SDS architecture. As Lightbits demonstrated in its benchmarking study, separating storage from compute with a disaggregated architecture and pooling NVMe flash with TCP/IP enables system owners to improve performance and storage utilization.