The Non-Volatile Memory Express (NVMe) specification defines a new method to access solid-state drives (SSDs) over a PCIe bus, and its adoption is forecast to grow rapidly over the next few years thanks to its performance advantages and lower latency compared with legacy protocols such as SATA. NVMe-enabled storage infrastructures are not only seeing wide deployment; they're now entering other data center areas that were once reserved for legacy HDDs and SSDs. NVMe has become an agent for next-generation data centers. To understand this evolution, it's important to review the history and examine some of the new application workloads that are being transformed by this protocol.
Over the past several decades, data storage has followed a model similar to that of data compute, which emerged from a central mainframe architecture and evolved to a distributed client/server architecture. It then returned to a central architecture driven by virtualization, and then back to a distributed architecture driven by web and cloud-based applications. Storage, in turn, has vacillated between direct-attached media and distributed storage-area networks (SANs), where parallel and serial interfaces physically move data between the CPU and storage media using SCSI command sets and SATA/SAS protocols.
With the advent of flash-based SSDs, legacy SATA/SAS protocols initially thrived because they were field proven, compatible and seamlessly integrated into existing systems. Over time, SATA emerged as the most common and cost-effective interface standard despite lengthy delays associated with data-access requests and receipts. These delays are expected in hard drives, where the magnetic format imposes disk rotations and seek latencies, but for flash-based SSDs that use memory cells (with no rotations or seeks), the SATA delays are unacceptable. Ultimately, the SATA protocol proved too slow for flash-based storage systems and isolated users from the full benefits of SSDs.
These SATA limitations set the stage for PCI Express (PCIe) to become the next logical interface for storage media, although early PCIe storage was still based on the legacy SCSI software stack designed for SATA/SAS interfaces. With PCIe slots that connect directly to the CPU (providing memory-like access) and a much smaller software stack running on top (see Figure 1), the PCIe interface reduces data-transfer latency while increasing bandwidth relative to legacy SATA/SAS interfaces. Although the PCIe interface was a step in the right direction, each SSD required its own proprietary driver; the lack of standardization created more development work for SSD vendors and added complexity and incompatibility. Hence, the advent of NVMe.
Non-Volatile Memory Express is a standard protocol and driver for SSDs based on NAND flash memory. Developed by an open industry consortium of leading storage, networking and server vendors (NVMexpress.org), the NVMe interface increases nonvolatile-storage performance in PCIe-based servers and SSDs by eliminating the SCSI command stacks and direct-attached storage (DAS) bottlenecks associated with traditional HDD interfaces. It’s a uniquely tuned I/O architecture specific to solid-state media that removes the legacy HDD interface baggage. Conceptually, it requires only one driver to work with every SSD that adheres to the standard.
NVMe-compliant SSDs can deliver up to 10 times the sequential read performance of SATA-based SSDs, enabling more-rigorous application workloads on fewer devices and with smaller physical hardware footprints. Initially reserved for high-performance, high-capacity workloads (at a premium cost), NVMe-based SSDs now serve in other areas of the data center previously reserved for SATA-based SSDs—and they also create a convergence of compute and storage thanks to their widespread adoption.
NVMe and SSDs
The NVMe specification extends PCIe flash storage to new levels. It was specifically architected and optimized from the ground up for nonvolatile solid-state storage. It features a streamlined memory interface, command set and queue design that’s well suited for today’s virtual operating systems. Also, its direct connection to the CPU (through the PCIe bus) streamlines the storage-device stack and delivers faster performance than traditional SATA/SAS interfaces. As a result, all major server manufacturers have added support for NVMe-based U.2 SSDs (2.5" format), which will soon surpass SATA-based SSDs in units shipped.
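The queue design mentioned above is the heart of NVMe's performance story: the specification allows up to 64K queue pairs with up to 64K commands each, versus AHCI's single queue of 32 commands, and a queue pair can be pinned to each CPU core to avoid cross-core locking. The following is a deliberately simplified, toy model of one submission/completion queue pair; the class name, sizes and flow are illustrative only and do not reflect the real driver or doorbell mechanics.

```python
from collections import deque

class NVMeQueuePair:
    """Toy model of one NVMe submission/completion queue pair.

    The real specification allows up to 64K queues of up to 64K
    commands each; AHCI (SATA) permits one queue of 32 commands.
    This sketch is conceptual only, not the actual driver.
    """
    def __init__(self, depth=64 * 1024):
        self.depth = depth
        self.submission = deque()   # host posts commands here
        self.completion = deque()   # device posts results here

    def submit(self, command):
        # Host side: enqueue a command, respecting queue depth.
        if len(self.submission) >= self.depth:
            raise RuntimeError("submission queue full")
        self.submission.append(command)

    def process(self):
        # Device side: drain submissions, post completion entries.
        while self.submission:
            cmd = self.submission.popleft()
            self.completion.append((cmd, "OK"))

    def reap(self):
        # Host side: collect completion entries.
        results = []
        while self.completion:
            results.append(self.completion.popleft())
        return results

# In practice one queue pair per CPU core avoids lock contention.
qp = NVMeQueuePair()
for lba in range(4):
    qp.submit(("READ", lba))
qp.process()
print(qp.reap())
```

The key design point the sketch illustrates is that submission and completion are decoupled, so many cores can keep many deep queues in flight simultaneously, which is what lets NVMe saturate flash media where AHCI's single shallow queue cannot.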
Table 1 shows typical specifications for SSD media covering SATA, SAS, PCIe and NVMe, using leading Western Digital SSD brands (in parentheses) for comparison. SSDs based on the NVMe specification deliver the best IOPS (I/O operations per second) and bandwidth performance and the highest capacity ranges.
Table 2 compares key features of the SATA protocol based on its Advanced Host Controller Interface (AHCI) with the PCIe protocol and the NVMe specification.
NVMe vs. SATA in the Data Center
Compared with SATA, the NVMe standard delivers better bandwidth and IOPS performance, plus lower latencies. It also allows storage to scale without the cost or complexity of battery-backed RAID or HBA cards. The advantages of NVMe-based storage over SATA-based storage span a variety of workloads, as the following discussion shows.
Traditional Enterprise Database Workloads
To help scale databases and avoid server sprawl or poor hardware-resource utilization, Microsoft SQL Server, Oracle DB and Oracle MySQL can employ NVMe bandwidth of up to 12 times that of SATA, with latency reductions of up to 50 percent. For example, a single database server with SATA-based storage is limited by I/O wait conditions that throttle device performance and slow system operations. A common solution has been to either buy another server and split the workloads, or dedicate one server to back-end inventory and another to order entry. In each case, however, two software licenses are required.
Replacing SATA-based SSDs or HDDs with NVMe SSDs can reduce the I/O wait time by 50 percent, enabling both database workloads to run on a single server and requiring only one license. A database license for a single core can cost tens of thousands of dollars, and a multitude of software licenses can represent more than 60 percent of total cost of ownership (TCO) and total operational cost (TOC). NVMe, however, provides the bandwidth and performance needed for interactive business intelligence (BI) workloads as well as the highest IOPS to support transactional workloads such as online transaction processing (OLTP).
In-Memory Database Workloads
Apache Spark and other in-memory database (IMDB) applications (that rely on main memory for data storage) have storage components that must persist through changes and scan data sets that are normally larger than the combined cluster memory. In this scenario, the higher bandwidth that each NVMe SSD affords is enough to feed DRAM reloads and associated data scans at CPU speeds.
In the case of SATA, multiple drives in a RAID0 configuration are required, creating challenges in mean time between failure (MTBF) and rebuild. A single failure in an eight-drive RAID0 system means the node is invariably dead and must be reloaded; reloading the node, however, puts affected systems into “degraded mode” or, even worse, puts them out of service while the node is completely rebuilt.
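The reliability penalty described above can be put in rough numbers. RAID0 has no redundancy, so the stripe set fails when any member drive fails; assuming independent failures, the array's mean time between failures scales as roughly 1/N of a single drive's. The figures below are illustrative assumptions, not vendor specifications.

```python
def raid0_mtbf(drive_mtbf_hours: float, num_drives: int) -> float:
    """Approximate MTBF of an N-drive RAID0 stripe set.

    RAID0 offers no redundancy: any single drive failure kills the
    array, so (assuming independent, exponential failures) the
    combined MTBF is the per-drive MTBF divided by the drive count.
    """
    return drive_mtbf_hours / num_drives

# Hypothetical 2-million-hour drive MTBF, as in the 8-drive
# RAID0 scenario above.
single_drive = 2_000_000
print(raid0_mtbf(single_drive, 8))   # the 8-drive stripe drops to 250,000 hours
```

In other words, striping eight SATA drives to approach one NVMe device's bandwidth cuts the expected time to node failure by a factor of eight, which is exactly the rebuild-and-degraded-mode exposure the paragraph above describes.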
Internet of Things Workloads
NVMe is also well suited to aggregating diverse data sources, especially those generated from the newer industrial Internet of Things (IoT) workloads. A factory floor, for example, can have thousands of sensors streaming data at thousands of times per second to a NoSQL database (such as MongoDB or Cassandra) at hundreds of kilobytes per second (KB/s). The advanced bandwidth of NVMe is essential for aggregating these sources into the database while providing enough bandwidth and IOPS to also perform analytics.
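A back-of-the-envelope calculation shows why SATA struggles with the factory-floor scenario above. The sensor count and per-sensor rate below are assumptions chosen to match the "thousands of sensors at hundreds of KB/s" description, not measurements.

```python
# Illustrative ingest arithmetic for the IoT aggregation scenario.
# Both figures are assumptions, not measured values.
sensors = 5_000           # assumed number of streaming sensors
per_sensor_kbps = 200     # assumed per-sensor rate, KB/s

total_mb_per_s = sensors * per_sensor_kbps / 1024
print(f"{total_mb_per_s:.0f} MB/s sustained ingest")
```

Even these modest assumptions yield roughly 1 GB/s of sustained writes before any analytics reads, which is beyond a single SATA SSD's ~550 MB/s interface ceiling but well within reach of one NVMe device.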
IoT devices (e.g., autonomous cars, drones, factory/farm machines and equipment, and surveillance cameras) also collect massive amounts of data. By 2020, connected and autonomous car data traffic per vehicle may reach over 280 petabytes, or 280 million gigabytes, annually, according to Gartner. The only way to effectively handle these data deluges is through on-board flash storage and roadside gateways with NVMe devices. But even with this infrastructure, massive autonomous "fleets" will need to push processed data to the cloud for storage and analytics, creating a data tsunami that requires the bandwidth, IOPS and low latency enabled by the streamlined NVMe stack.
Machine learning can effectively use NVMe-based devices, especially with upcoming support for Direct Memory Access (DMA) extensions that enable certain hardware subsystems to access main system memory independently of the CPU. This access enables the CPU not only to read data faster but also to perform higher-priority tasks faster, such as feeding the graphics processing unit (GPU) arrays. SATA-based systems show lower GPU utilization, since data ping-pongs from multiple SATA drives to DRAM and then into the GPU by way of the host CPU. This process causes performance gaps, especially when a GPU array is data starved.
Just as we experienced the ebb and flow of past compute and storage architectures, we are experiencing another wave of disaggregation in the data center. Unlike the protocols of the past, however, NVMe stands as a bridge between centralization and decentralization, offering flexibility and choice previously unseen as a standard. Stay tuned as NVMe navigates through the future of computing, enabling powerful and versatile systems and taking its rightful place as the data center change agent.
About the Author
Anand Jayapalan is Vice President of Enterprise and Client Compute Solutions Marketing for Western Digital. He is responsible for driving Western Digital’s enterprise and client compute storage solutions marketing. This role includes ISV certifications on the product portfolio to enable focused solutions with leading server and storage OEMs, as well as with emerging SDS partners driving product and solution differentiation and thought leadership for emerging workloads and customer pain points. Before joining Western Digital, Anand led the incubation and growth of SanDisk’s foray into the hyperscale and cloud data center. He received his Master of Science degree in electrical engineering from the University of South Florida and his Master of Business Administration degree from Pepperdine University.