In 2016, we will stop putting hard disk drives in our servers.
Solid-state drives (SSDs) have reached a tipping point. Compared with traditional spinning hard disk drives (HDDs), SSDs have much lower latency, as much as 1000 times the number of I/Os per second (IOPS) and three to five times the throughput. Coupled with price parity or better, server SSDs could rapidly replace internal HDDs as soon as this year.
There's no question that servers that use SSDs as direct-attached internal storage run faster -- often much faster -- than servers with HDDs. That means fewer servers for the same workload, which avoids significant cost. It's also possible to retrofit existing servers with SSDs, speeding them up and extending their useful life. Nor is it necessary to replace every hard drive; tiering software can move low-activity data off the SSDs and onto the HDDs. In fact, adding just two or four SSDs and converting the hard drives to bulk storage will still outperform the whole of the previous HDD-based configuration.
Saving money and boosting performance aren't the only wins for SSDs. Some tasks that used to be very difficult are now duck soup. As an example, rapidly scanning video during editing used to require big, fast RAID arrays to get low enough latency. Now it's possible from an SSD-equipped workstation. Database searches operate much faster too (although the ultimate in performance here is obtained by using in-memory approaches). In the financial world, SSDs bring much lower access latency to the table and, with transaction costs measured in mega-dollars per second, soon pay for themselves.
Server SSDs are even bailing out the cloud. One problem from the start in cloud computing has been the latency and rate of I/O to the underlying storage. The cloud's early stateless servers have given way to demanding applications leveraging graphics processing units and large amounts of memory, with data stored in local drives inside the virtualized servers. While the mega-clouds have tried out HDDs to support large workloads, SSDs can simply offer much more I/O.
Taking the server flash plunge
What does this mean for the average data center? A long time ago, mega cloud service providers such as AWS and Google cut the umbilical to traditional server vendors in favor of buying direct from original design manufacturers (ODMs) such as Supermicro, Lenovo, Mitac International Corp. and Quanta. In 2016, mainstream data center managers will begin to follow in their footsteps.
Unlike traditional tier-one vendors, ODMs do not push special interfaces or drivers, making it possible to use generic SSDs with their systems. An SSD in the $300 per terabyte range compares very favorably to a $700 SAS 1 TB hard disk drive from Dell. (It's also worth noting that Dell lists a 960GB SSD at $533 -- meaning that SSD can actually be cheaper than an equivalent-capacity enterprise hard drive.)
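To make the comparison concrete, the figures above can be normalized to dollars per terabyte. The following is a minimal sketch, not a pricing tool: the function name `price_per_tb` and the use of decimal terabytes (1 TB = 1000 GB) are assumptions made for illustration, and the prices are simply the ones quoted in this article.

```python
def price_per_tb(price_usd: float, capacity_gb: float) -> float:
    """Return the price in USD per terabyte.

    Assumes decimal units (1 TB = 1000 GB), which is how drive
    capacities are normally advertised.
    """
    return price_usd / (capacity_gb / 1000.0)


# Prices quoted in the article: (list price in USD, capacity in GB).
quoted_drives = {
    "Dell 1 TB SAS HDD": (700.0, 1000.0),
    "Dell 960 GB SSD": (533.0, 960.0),
}

# The generic SSD is already quoted per terabyte in the article.
GENERIC_SSD_USD_PER_TB = 300.0

if __name__ == "__main__":
    for name, (price, capacity_gb) in quoted_drives.items():
        print(f"{name}: ${price_per_tb(price, capacity_gb):.0f} per TB")
    print(f"Generic SSD: ${GENERIC_SSD_USD_PER_TB:.0f} per TB")
```

On these numbers the Dell SSD works out to roughly $555 per terabyte, below the $700 enterprise hard drive, and the generic SSD is cheaper again.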
This pricing examination challenges the long-standing myth that SSDs are more expensive than hard drives. The myth holds up only when looking at the prices charged by array vendors, whose SSDs have been modified to work only in their arrays. Buying the same (but generic) product on the Internet actually reverses the picture.
Another flash myth is how quickly it wears out. Today, a flash drive typically wears out after five years of extremely heavy use, much better than the original four years of light use that MLC flash offered. Assuming a server refresh rate of 36 months, most flash products will outlive the overall system's life expectancy with quite a bit of room to spare. Increasingly, there's no reason not to convert primary server storage to SSDs.
Server SSD operational caveats
The high performance of SSDs introduces changes compared with the old HDD approach.
First, SSDs push the limits of RAID. With SSDs, most RAID controllers become bottlenecks in RAID 5 mode, throwing away a good part of the available performance. Consider that four SSDs can handle 1.6 million IOPS. That's much more than any RAID controller's XOR engine can support, and also more than the RAID controller's CPUs can handle well from an interrupt point of view. Thus, when deploying SSDs, it's better to use a RAID 1 or 10 mirror for data protection, which can be achieved using host software.
Next, look at the storage capacity needs of the server, because SSDs give better options for rightsizing than HDDs. The issue is that all other HDDs are more expensive than the benchmark 1 TB drive, sometimes by considerable amounts. For example, a workload that needs just 128 GB of storage can find a low-cost server SSD of that size, while the smallest available HDD (500 GB) still costs more than $150.
Make sure that the SSD you choose can handle the expected write workload. Most of the time, this really isn't an issue, but some write-heavy use cases such as sensor data storage or surveillance might require extended life drives, which are available from several vendors. For example, Toshiba's PX04 products have write durability that is rated up to 25 full disk writes per day for five years.
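A rough way to check a drive against a write workload is to convert its drive-writes-per-day (DWPD) rating into total rated writes and compare that with what the application will write. The sketch below assumes a hypothetical 1 TB drive and a hypothetical workload of 2 TB written per day; only the 25 DWPD and five-year figures come from the rating quoted above. Function names are illustrative.

```python
DAYS_PER_YEAR = 365  # leap days ignored for a back-of-the-envelope estimate


def rated_writes_tb(capacity_tb: float, dwpd: float, years: float) -> float:
    """Total data (in TB) the drive is rated to absorb over its warranty."""
    return capacity_tb * dwpd * DAYS_PER_YEAR * years


def required_dwpd(daily_writes_tb: float, capacity_tb: float) -> float:
    """DWPD rating a workload needs for a drive of the given capacity."""
    return daily_writes_tb / capacity_tb


def is_durable_enough(daily_writes_tb: float, capacity_tb: float,
                      drive_dwpd: float) -> bool:
    """True if the drive's DWPD rating covers the workload's write rate."""
    return required_dwpd(daily_writes_tb, capacity_tb) <= drive_dwpd


if __name__ == "__main__":
    capacity_tb = 1.0      # hypothetical drive size, not from the article
    daily_writes_tb = 2.0  # hypothetical write-heavy workload

    total = rated_writes_tb(capacity_tb, dwpd=25, years=5)
    print(f"Rated writes over five years: {total:,.0f} TB")
    print(f"Workload needs {required_dwpd(daily_writes_tb, capacity_tb):.1f} DWPD")
    print("Durable enough:", is_durable_enough(daily_writes_tb, capacity_tb, 25))
```

Under these assumptions a 25 DWPD rating amounts to 45,625 TB of rated writes over five years, far more than the example workload would consume. The same check shows when a lower-rated drive would fall short.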
Finally, when calculating the costs of server flash drives to justify the project to the CIO, remember to factor in power savings. A standard SSD (at around 0.2 W) saves an average of 10 W over an HDD, and the need for fewer servers generates substantial additional savings. Also, because you are no longer locked in to a single vendor for your HDDs and no longer have to purchase capacity up front, SSD purchases can be deferred to an as-needed basis. That means that when it comes time to purchase new drives, they will likely have decreased considerably in price.
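The power line of that business case is simple arithmetic. The sketch below uses the 10 W average saving per drive cited above; the drive count and the electricity price are hypothetical placeholders to be replaced with your own figures, and cooling overhead and the savings from retiring whole servers are deliberately left out.

```python
HOURS_PER_YEAR = 24 * 365  # drives assumed to be powered on all year


def annual_energy_savings_kwh(drive_count: int,
                              watts_saved_per_drive: float = 10.0) -> float:
    """Energy saved per year, in kWh, by swapping HDDs for SSDs."""
    return drive_count * watts_saved_per_drive * HOURS_PER_YEAR / 1000.0


def annual_cost_savings_usd(drive_count: int, usd_per_kwh: float,
                            watts_saved_per_drive: float = 10.0) -> float:
    """Electricity cost saved per year at the given price per kWh."""
    kwh = annual_energy_savings_kwh(drive_count, watts_saved_per_drive)
    return kwh * usd_per_kwh


if __name__ == "__main__":
    drives = 100          # hypothetical fleet size
    price_per_kwh = 0.10  # hypothetical electricity price in USD

    kwh = annual_energy_savings_kwh(drives)
    usd = annual_cost_savings_usd(drives, price_per_kwh)
    print(f"{drives} drives save about {kwh:,.0f} kWh per year")
    print(f"At ${price_per_kwh:.2f}/kWh that is about ${usd:,.0f} per year")
```

For 100 drives this comes to 8,760 kWh a year, or roughly $876 at the assumed rate. It is a modest sum on its own, which is why the larger savings come from needing fewer servers.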
Where we're going in flash
The world of hard disk drives is divided in two: SAS-connected enterprise drives, and SATA-connected capacity and consumer models. In the SSD space, the concept of the enterprise drive is less well-defined. We are finding little value in SAS SSDs with dual interface ports for servers in a world where servers are virtualized and failure recovery is achieved by starting new instances on another server. Going forward, server SSDs will split into two camps: NVMe drives with PCIe interfaces for the most challenging tasks and SATA drives for the rest. In fact, the SATA Express interface allows both of these drive types to talk to the host via the same connector. In that case, SATA drives need no longer be enterprise drives, since all but a few SSDs now have the features needed in a server, such as super-capacitor backup power to allow the write cache to be flushed to flash upon power failure, higher write durability and better internal error correction.
Overall, SSD pricing will continue to drop, as new technologies become mainstream. Take 3D NAND; this technology enables much denser chips, and we'll see further price erosion as it is perfected. The industry is also nearing the launch of 4-bit-per-cell QLC flash for the bulk storage end of the market, which will sell in the $30 per TB range in the 2017 timeframe.
The other major change that will impact the SSD industry is the arrival of SSDs with capacities beyond that of even the largest spinning drives. HDD vendors are running into the limitations of physics. Helium-filled drives and shingled recording together have made 10 TB drives possible, but any larger capacities will need either heat-assisted magnetic recording, which uses a laser to soften a spot on the magnetic media, or pre-formatted disks with isolated pits for each bit. These are both several years from production. Meanwhile, we've seen the announcement of 16 TB SSDs and we can expect SSD capacity to grow to as much as 30 TB by 2020.
One thing to watch for is the advent of 3D XPoint memory from Intel/Micron this year. This type of memory will challenge NVMe flash in the fastest applications, but it requires changes in the server and in the system software to really bring value, so it will likely be 2017 before it becomes mainstream.