Contrary to popular belief, SSDs aren't new. The first SSD was created by StorageTek in 1978. Until 1995, most SSDs were actually PC-sized devices with row after row of RAM sticks, a big internal battery and a SCSI interface. These SSDs, now differentiated as "RAM disks" (not to be confused with RAM disks created in main memory), were extremely fast but insanely expensive, and both memory and batteries were sources of frequent and costly failures. In 1995, M-Systems introduced the first Flash-based SSD. Since then, Flash memory, thanks to high demand from today's variety of personal gadgets, has only gotten better, faster and cheaper, and has increased in both reliability and capacity.
Technically there is very little to know about modern Flash SSDs before getting started, except that they come in two varieties: single-level cell (SLC) and multi-level cell (MLC). SLC-based Flash is extremely fast but has lower capacity and is more expensive. MLC is not as fast as SLC but has higher capacity and is less expensive. When you buy an SSD, first determine which type of device it is. Besides that, it will act like any hard drive. In the past there were concerns about Flash cells going bad over time, which would eventually render the drive useless, but these problems have largely been overcome.
Because SSDs produce no noise, offer fast access, use less power and are unaffected by vibration (see how vibration affects disk latency in the YouTube video below), they are considered ideal for laptop use. Who wouldn't love faster wake-up times, more battery life and drop survival? But how do SSDs fit into data centers?
Let's examine three ways SSDs are incorporated into the data center.
SSD Method 1: Faster disk
In this scenario, SSDs are treated as nothing more than really fast disks. You might replace 15K RPM SAS drives with new SSDs in the same way that many replaced 10K RPM drives with 15K RPM drives.
While you will indeed have a fast disk, the cost is very high, so this approach is best reserved for a limited set of applications, such as databases.
SSD Method 2: Tier 1 disk
In this scenario, a hierarchical storage manager (HSM) uses SSD for top-tier data, 15K or 10K RPM disk for second-tier and 7,200 RPM disk for third. HSM, in short, is software that controls the flow of data across multiple tiers, each offering a different trade-off between speed and capacity. Tier 1 may consist of a couple hundred gigabytes of fast SSD, and tier 3 may consist of many 1TB 7,200 RPM drives. The software "sloshes" data between the tiers based on frequency of use so that the data you most often require is always on the fastest disk.
The primary advantage of HSM is that algorithms handle all the hard work of deciding what should be on which disks.
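To make the idea concrete, here is a minimal sketch of frequency-based placement. It is not how any of the HSM products work internally; the tier names, the capacities (counted in files rather than bytes) and the place_by_frequency function are all made up for illustration.

```python
from collections import Counter

# Hypothetical tiers, fastest first. Capacity is a number of files;
# None means "whatever is left over". Real HSM software tracks bytes.
TIERS = [("ssd", 2), ("15k_sas", 3), ("7200_sata", None)]


def place_by_frequency(access_log, tiers=TIERS):
    """Map each file name to a tier so the most-accessed files land on the fastest tier."""
    counts = Counter(access_log)
    # Hottest first; ties are broken by name so the result is deterministic.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    placement = {}
    start = 0
    for tier_name, capacity in tiers:
        end = len(ranked) if capacity is None else start + capacity
        for name in ranked[start:end]:
            placement[name] = tier_name
        start = end
    return placement


# Usage: a made-up access log where "a" is the hottest file and "f" the coldest.
log = ["a"] * 5 + ["b"] * 4 + ["c"] * 3 + ["d"] * 2 + ["e"] * 2 + ["f"]
placement = place_by_frequency(log)
```

Rerunning the placement as the access log changes is what moves data up and down the tiers. A real product would also weigh file size, age and policy rules, none of which are modeled here.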
Historically, HSM has primarily been used to exploit the high capacities of fast tape, such as LTO4, as the last data tier. With low-cost 1TB drives now available, however, this is becoming less common.
Examples of HSM software include CommVault DataMigrator, VERITAS Enterprise Vault, Sun Microsystems SAMFS/QFS and Quantum StorNext.
SSD Method 3: Hybrid pool
Sun's FISHworks team created the most interesting solution for putting SSD to work, known as the Hybrid Pool. File systems use DRAM for caching (the file system cache) and disk for storage. One way to improve file system performance is simply to add more DRAM main memory. However, that is an expensive option. SSD is faster than hard drives but slower than DRAM, and it is cheaper than DRAM but more expensive than hard drives. Therefore, it logically falls directly between memory and disk, creating unique new possibilities.
Solaris ZFS was extended in two ways to capitalize on this. First, a second-level caching capability (L2ARC) was added to the file system cache. This means that rather than flushing old data when you've filled up your file system cache, ZFS can relocate it to SSDs, extending the file system cache with a slower but much more affordable device. Second, writes can be committed to fast SLC SSDs more quickly than to hard drives, so the SSDs act like a traditional write-back cache.
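The second-level cache idea can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the TwoLevelCache class is hypothetical, both levels use plain least-recently-used eviction (the real ZFS cache uses a more sophisticated replacement policy), and the "SSD" is just a second in-memory dictionary.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Toy two-level cache: a small 'DRAM' level backed by a larger 'SSD' level."""

    def __init__(self, dram_slots, ssd_slots):
        self.dram = OrderedDict()  # stands in for the in-memory file system cache
        self.ssd = OrderedDict()   # stands in for the second-level cache on SSD
        self.dram_slots = dram_slots
        self.ssd_slots = ssd_slots

    def put(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        self.ssd.pop(key, None)  # keep a block in one level only (a simplification)
        if len(self.dram) > self.dram_slots:
            old_key, old_value = self.dram.popitem(last=False)
            # Rather than discarding the oldest block, relocate it to the SSD level.
            self.ssd[old_key] = old_value
            if len(self.ssd) > self.ssd_slots:
                self.ssd.popitem(last=False)  # only here is cached data really dropped

    def get(self, key):
        """Return (value, where), where 'where' is 'dram', 'ssd' or 'miss'."""
        if key in self.dram:
            self.dram.move_to_end(key)
            return self.dram[key], "dram"
        if key in self.ssd:
            value = self.ssd.pop(key)
            self.put(key, value)  # promote the block back to the fast level
            return value, "ssd"
        return None, "miss"  # a real file system would now read from disk


# Usage: with two DRAM slots, the third insert pushes "a" out to the SSD level.
cache = TwoLevelCache(dram_slots=2, ssd_slots=2)
for name, block in [("a", 1), ("b", 2), ("c", 3)]:
    cache.put(name, block)
first_lookup = cache.get("a")
second_lookup = cache.get("a")
```

The point of the sketch is the eviction branch in put: a block that would have been thrown away gets a second life on a cheaper device, so a later read is served from SSD instead of from a hard drive.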
Let me elaborate on this second point. The success of most enterprise storage arrays has been due to their use of NVRAM to cache synchronous writes. These are writes that are flagged (O_DSYNC) as needing to be put on stable storage before they are acknowledged. Waiting for that acknowledgement is why NFS performance is often slow, and why solutions such as NetApp Filers all have NVRAM onboard. ZFS can now use SSDs for this purpose rather than PCI board solutions, which cost more than SSDs and have much smaller capacities. For example, only the very largest NetApp Filers have more than 1GB of NVRAM.
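For readers who have never issued one, this is roughly what a synchronous write looks like from the application side. It is a sketch only: the write_synchronously helper and the scratch file name are invented for the example, os.O_DSYNC is not defined on every platform (hence the fallback), and nothing here is specific to ZFS, which handles its log device internally.

```python
import os
import tempfile

# Assumption: prefer O_DSYNC, fall back to O_SYNC, and to no flag at all
# on platforms that define neither (in which case writes are NOT synchronous).
SYNC_FLAG = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))


def write_synchronously(path, data):
    """Write all of data to path; with a sync flag set, each write() returns
    only after the operating system reports the data is on stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | SYNC_FLAG, 0o600)
    try:
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than requested, so loop until done.
            written += os.write(fd, data[written:])
        return written
    finally:
        os.close(fd)


# Usage: write a small record to a scratch file in a temporary directory.
scratch = os.path.join(tempfile.mkdtemp(), "intent.log")
count = write_synchronously(scratch, b"commit record")
```

Every call to os.write in that loop has to wait for the storage device. On a hard drive that wait is dominated by seek and rotation time, which is exactly the latency that NVRAM, or an SSD standing in for it, is meant to absorb.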
What the future of SSD holds
The hybrid method seems to be catching on quickly. Even NetApp introduced the Level 2 caching concept under the name "FlexCache." The clear benefit of the hybrid concept is reduced administrative intervention. The file system generally knows what you need, and SSD can supply the additional resources to meet demand.
Even so, for large databases where access needs to be highly uniform regardless of usage patterns, building volumes directly on SSD may be the best way to go.
ABOUT THE AUTHOR: Ben Rockwood is the director of systems at cloud computing infrastructure company Joyent Inc. A Solaris expert and Sun evangelist, he lives just outside of Silicon Valley, Calif., with his smokin' hot wife Tamarah and their three children. Read his blog at cuddletech.com.
This was first published in January 2009.