For all intents and purposes, the innards of a storage system are commoditized. Yet the battle rages on to ensure that different vendors' storage arrays interoperate, while still retaining a level of differentiation.
Distributed computing means data center storage must interoperate with servers from different vendors. This increases the need for standardized storage architectures. Cloud takes standardization demands one step further.
Data center storage capacity has long depended on a relatively unchanged basic technology: spinning magnetic disk. With only a few major disk drive manufacturers -- Western Digital and Seagate, along with Hitachi and Toshiba -- the media is essentially a commodity.
The problem revolves around how arrays are put together with controllers. IT organizations that bought into high-end, high-cost storage, such as EMC's Symmetrix VMAX, want to manage the whole storage estate from a single set of tools. However, when a disk array's operation is defined by proprietary software and firmware in the array controllers, it is difficult to create fully functional storage management tools.
Vendors -- IBM with the SAN Volume Controller and EMC with VPLEX, as well as Hitachi Data Systems, HP and NetApp -- have touted proprietary storage management tools as enabling virtualization across a mix of storage architectures. However, the majority of these tools work against only the vendor's storage systems, and in many cases, against only a portion of that portfolio. The search for true, high-function heterogeneous storage management has been fruitless for many end users.
Cloud computing is changing how we regard storage. Workloads are increasingly mixed, and storage needs cover a range of object, file and block modes with different I/O requirements. However, to enable cloud architectures, the storage infrastructure must be seen as a single resource pool that the organization can automatically apply to the changing needs of workloads. This is possible only through the provision of highly standardized tools. The move has started, but still has a way to go.
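To make the single resource pool idea concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's API: the `Array` and `StoragePool` classes, the array names and the first-fit placement rule are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    """One storage array as the pool sees it (hypothetical model)."""
    name: str
    modes: set       # access modes the array supports: "block", "file", "object"
    free_gb: int     # unallocated capacity in gigabytes


@dataclass
class StoragePool:
    """Treats arrays from different vendors as one pool of capacity."""
    arrays: list = field(default_factory=list)

    def total_free_gb(self):
        # Capacity is reported for the pool as a whole, not per array.
        return sum(a.free_gb for a in self.arrays)

    def allocate(self, mode, size_gb):
        # First-fit placement: use the first array that supports the
        # requested mode and has enough room. Returns the array name,
        # or None if no array in the pool can satisfy the request.
        for a in self.arrays:
            if mode in a.modes and a.free_gb >= size_gb:
                a.free_gb -= size_gb
                return a.name
        return None


if __name__ == "__main__":
    pool = StoragePool([
        Array("vendor_a_array", {"block"}, 100),
        Array("vendor_b_array", {"file", "object"}, 50),
    ])
    print(pool.allocate("object", 20))
    print(pool.total_free_gb())
```

The workload asks the pool for a mode and a size and never names an array. That only works if every array exposes the same operations to the management layer, which is where standardized tools come in.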
Flash storage to the rescue
Standardized data center storage capacity is difficult to create solely on magnetic disk storage. A medium dependent on the interaction of disk platters and read/write heads requires that intelligent array controllers be tuned to manage the requests of diverse workloads.
Flash storage's approach to data management differs from that of hard drives. Flash is a direct access storage architecture; there is no latency while a moving head finds the correct disk area from which to retrieve data. The increase in data management speed over spinning disk means flash can be applied to different workload types in the same array. It is also more easily virtualized across different vendors' offerings.
At last, the standardization of storage may be a real promise, rather than just a talking point -- but not quite yet.
There are still a lot of differences in the ways vendors approach flash. Many of the incumbent storage vendors tout a hybrid approach: a separate layer of flash on top of a magnetic disk array. Problems arise when workloads require data that's not in this flash cache layer, and it has to be pulled from the magnetic disk. This makes certain data actions slower than they would be on a 100% magnetic disk array.
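The cache-miss penalty described above can be sketched with a simple average-latency model. This is a rough illustration, not a measurement of any product: the function name, the idea of a fixed lookup cost paid on every read, and the latency figures in the usage lines are all assumptions.

```python
def hybrid_read_latency_ms(hit_ratio, flash_ms, disk_ms, lookup_ms):
    """Average read latency for a flash cache in front of magnetic disk.

    Assumed model: every read first pays lookup_ms to check the flash
    layer. A hit is then served from flash; a miss falls through to disk.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    hit_cost = lookup_ms + flash_ms
    miss_cost = lookup_ms + disk_ms   # slower than going straight to disk
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost


if __name__ == "__main__":
    # Illustrative numbers only (milliseconds), not vendor figures.
    flash_ms, disk_ms, lookup_ms = 0.1, 8.0, 0.05
    for ratio in (0.0, 0.5, 0.9):
        avg = hybrid_read_latency_ms(ratio, flash_ms, disk_ms, lookup_ms)
        print(f"hit ratio {ratio:.0%}: average read {avg:.3f} ms")
```

Under this model, a read that misses the cache costs the lookup time plus the disk time, so it is always slower than the same read on a disk-only array. The hybrid array comes out ahead on average only when enough reads hit the flash layer.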
Cascading systems of pure flash and magnetic disk arrays might be a necessary step to maximize existing data center storage capacity investments. However, those existing legacy arrays are troublemakers when a single management layer is being created. The EMC ViPR storage virtualization product shows promise, providing a greater degree of control across a mixed storage estate.
An array of connectivity options
All-flash arrays offer stiff competition to mixed storage capacity that has been tied together. Flash vendors such as Pure Storage, Violin Memory and Nimble Storage provide intelligent software that minimizes storage volumes and offers advanced management of the vendor's systems across a virtualized environment.
Converging upon storage
Converged infrastructure (CI) systems muddy the water somewhat when it comes to cloud storage management.
Nutanix -- a vendor that started out in the storage space -- offers a hyper-converged infrastructure platform that includes advanced storage management software. IBM's PureFlex System and PureData System, Dell PowerEdge FX2 systems, HP Converged Infrastructure, and other offerings all present approaches to integrating the direct-attached storage contained within the CI systems with any existing external arrays or new ones acquired for expansion purposes.
There is also a move to include server-side storage in the form of flash memory on faster connections, such as PCIe. IBM has developed an interconnect for use within its own systems that can further speed up storage. This interconnect, the Coherent Accelerator Processor Interface (CAPI), brings in a sense of the proprietary again -- it is up to IBM whether the connector makes storage interoperable with other vendors' systems to a high level of fidelity. Converged systems still have to be able to pool their resources to share them. This will require far more advanced tooling than we have seen to date.
About the author:
Clive Longbottom is the co-founder and service director of IT research and analysis firm Quocirca, based in the U.K. Longbottom has more than 15 years of experience in the field. With a background in chemical engineering, he's worked on automation, control of hazardous substances, document management and knowledge management projects.