This content is part of the Essential Guide: Guide to managing data center costs and the IT budget

The real data center storage capacity killer

Heterogeneity in data center storage architectures and controllers is a roadblock on the path to a standardized infrastructure underpinning diverse workloads.

For all intents and purposes, the innards of a storage system are commoditized. Yet the battle rages on to ensure that different vendors' storage arrays interoperate, while still retaining a level of differentiation.

Distributed computing means data center storage must interoperate with servers from different vendors. This increases the requirement for more standardized storage architectures. Cloud takes standardization demands one step further.

Storage management

Data center storage capacity has long depended on a relatively unchanged basic technology: spinning magnetic disk. With only a few major disk drive manufacturers -- Western Digital and Seagate, along with Hitachi and Toshiba -- the media is essentially a commodity.

The problem revolves around how arrays are put together with controllers. IT organizations that bought into high-end, high-cost storage, such as EMC's Symmetrix VMAX, want to manage the whole storage estate from a single set of tools. However, when a disk array operation is defined by proprietary software and firmware in the array controllers, it's problematic to create fully functional storage management tools.

Vendors -- IBM with the SAN Volume Controller and EMC with VPLEX, as well as Hitachi Data Systems, HP and NetApp -- have touted proprietary storage management tools as enabling virtualization across a mix of storage architectures. However, the majority of these tools work against only the vendor's storage systems, and in many cases, against only a portion of that portfolio. The search for true, high-function heterogeneous storage management has been fruitless for many end users.

Cloud computing is changing how we regard storage. Workloads are increasingly more mixed, and storage needs cover a range of object, file and block modes with different I/O requirements. However, to enable cloud architectures, the storage infrastructure must be seen as a single resource pool that the organization can automatically apply to the changing needs of workloads. This is possible only through the provision of highly standardized tools. The move has started, but still has a ways to go.
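As a sketch of what such a single resource pool might look like, the hypothetical interface below provisions block, file and object capacity through one entry point. The class and method names are illustrative assumptions, not any vendor's actual API.

```python
# Hypothetical sketch: one standardized pool that hands out block, file and
# object capacity to workloads through a single interface. All names here are
# illustrative, not a real product's API.

class StoragePool:
    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.allocations = []  # list of (name, mode, size_gb)

    def free_gb(self):
        # Capacity not yet carved out for any workload
        return self.capacity_gb - sum(size for _, _, size in self.allocations)

    def provision(self, name, mode, size_gb):
        """Carve out capacity for a workload, whatever its access mode."""
        if mode not in ("block", "file", "object"):
            raise ValueError("mode must be block, file or object")
        if size_gb > self.free_gb():
            raise RuntimeError("insufficient capacity in pool")
        self.allocations.append((name, mode, size_gb))
        return name

pool = StoragePool(capacity_gb=1000)
pool.provision("db-logs", "block", 200)
pool.provision("home-dirs", "file", 300)
print(pool.free_gb())  # prints 500
```

The point of the sketch is that the caller never cares which array, vendor or tier sits behind the pool -- exactly the abstraction the article argues standardized tooling must provide.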

Flash storage to the rescue

Standardized data center storage capacity is difficult to create solely on magnetic disk storage. A medium dependent on the interaction of disk platters and read/write heads requires that intelligent array controllers be tuned to manage the requests of diverse workloads.

Flash storage handles data quite differently from hard disk drives. Flash is a direct-access storage architecture; there is no latency while a moving head locates the correct area of the disk from which to retrieve data. The resulting speed advantage over spinning disk means flash can be applied to different workload types in the same array. It is also more easily virtualized across different vendors' offerings.
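A back-of-envelope calculation makes the gap concrete. The figures below are ballpark assumptions (a typical 7,200 rpm drive and a generic flash read latency), not any vendor's specification:

```python
# Rough latency comparison: random access on a 7,200 rpm disk vs. flash.
# All figures are illustrative assumptions, not measured vendor specs.

AVG_SEEK_MS = 8.5                       # assumed average seek time for a 7,200 rpm drive
ROTATIONAL_MS = (60 / 7200) * 1000 / 2  # one rotation is ~8.33 ms; average wait is half
FLASH_READ_MS = 0.1                     # assumed flash read latency

disk_ms = AVG_SEEK_MS + ROTATIONAL_MS
print(round(disk_ms, 2))  # prints 12.67 -- ms per random disk access
```

With those assumptions, a random disk access costs on the order of a hundred flash reads, which is why a single flash array can absorb workload mixes that would force careful tuning on spinning disk.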

At last, the standardization of storage may be a real promise, rather than just a talking point -- but not quite yet.

There are still a lot of differences in the ways vendors approach flash. Many of the incumbent storage vendors tout a hybrid approach: a separate layer of flash on top of a magnetic disk array. Problems arise when workloads require data that's not in this flash cache layer, and it has to be pulled from the magnetic disk. This makes certain data actions slower than they would be on a 100% magnetic disk array.
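The cache-miss penalty can be seen in a simple expected-latency model. The latencies and hit ratios below are illustrative assumptions:

```python
# Expected read latency of a hybrid array (flash cache layered over disk).
# Latency figures and hit ratios are illustrative assumptions.

FLASH_MS = 0.1   # assumed flash-cache read latency
DISK_MS = 12.0   # assumed magnetic-disk read latency

def effective_latency_ms(hit_ratio):
    """Weighted average: hits served from flash, misses pulled from disk."""
    return hit_ratio * FLASH_MS + (1 - hit_ratio) * DISK_MS

for hit_ratio in (0.95, 0.80, 0.50):
    print(hit_ratio, round(effective_latency_ms(hit_ratio), 2))
```

Even a modest drop in hit ratio lets the disk term dominate, so a workload whose hot data does not fit the flash layer sees latency closer to pure disk than pure flash.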

Cascading systems of pure flash and magnetic disk arrays might be a necessary step to maximize existing data center storage capacity investments. However, those existing legacy arrays are troublemakers when a single management layer is being created. The EMC ViPR storage virtualization product shows promise, providing a great deal of control across a mixed storage estate.

An array of connectivity options

Disk connectivity technologies include IDE, SCSI/iSCSI, SATA and SAS

Array connectivity technologies cover direct-attached, network-attached and storage-area network (DAS, NAS and SAN)

Connectivity predominantly comes in the form of Ethernet or Fibre Channel, with occasional InfiniBand use

All-flash arrays are stiff competition for mixed storage estates tied together with management software. Flash vendors such as Pure Storage, Violin Memory and Nimble Storage provide intelligent software that reduces stored data volumes and offers advanced management of the vendor's systems across a virtualized environment.

Converging upon storage

Converged infrastructure (CI) systems muddy the water somewhat when it comes to cloud storage management.

Nutanix -- a vendor that started out in the storage space -- offers a hyperconverged infrastructure platform that includes advanced storage management software. IBM's PureFlex System and PureData System, Dell PowerEdge FX2 systems, HP Converged Infrastructure and other offerings all present approaches to synthesizing the direct-attached storage contained within the CI systems with any existing external arrays, or with new ones acquired for expansion.

There is also a move to include server-side storage in the form of flash memory on faster connections, such as PCIe. IBM has developed an interconnect, CAPI (Coherent Accelerator Processor Interface), for use within its own systems that can further speed up storage. This connector reintroduces an element of the proprietary -- it is up to IBM whether the connector makes storage interoperable with other vendors' systems to a high level of fidelity. Converged systems still have to be able to pool and share their resources, and this will require far more advanced tooling than we have seen to date.

About the author:
Clive Longbottom is the co-founder and service director of IT research and analysis firm Quocirca, based in the U.K. Longbottom has more than 15 years of experience in the field. With a background in chemical engineering, he's worked on automation, control of hazardous substances, document management and knowledge management projects.

Join the conversation


Clive – Great article! I work for Zadara Storage, a provider of enterprise storage-as-a-service. We offer storage-on-demand both in the cloud and on-premises. Your article aligns with what we see every day.

The one thing that continues to be true in the storage market is that data continues to grow at exponential rates, and the value of that data continues to increase. These two factors are creating an environment where traditional storage arrays (and traditional purchasing models) cannot meet customer needs. Going through an evaluation cycle every 3-5 years to complete a very large CapEx purchase, only to receive an array that is too large initially, and must be grown into, doesn’t make sense. IT departments want, and need, to have instantaneous access to enterprise-class capacity, with all the high-end data protection, business continuity and disaster recovery features they have come to expect – and they should only pay for what they use. They should also be able to adjust the “personality” of the storage as-needed, in any direction (e.g., performance, capacity, drive types, number of drives, etc.).

Leveraging cloud economics, scalability and flexibility saves money, makes the IT department more responsive to their customers and shifts the power structure from the vendor to the consumer. IT managers no longer have to purchase expensive hardware and be locked into using it for 3-5 years. Today, if users don’t like the cloud storage solution they chose, they can turn it off and change vendors. As you indicate in your article, this new business model will prove to be very disruptive.
Despite a number of approaches that claim to standardize data center storage, none of them actually achieves the goal yet.
It seems like there's a real market out there for dead simple software on the front end that can talk to anything on the back end -- flash, spinning disk, cloud -- and can be configured to perform backups, tiering, etc. based on what hardware is plugged into the back end. But as long as there's money for the big hardware vendors in being proprietary, I guess we're not going to see it.