An object storage device for a data overload

For rampant data growth, try object storage on for size.

This article can also be found in the Premium Editorial Download: Modern Infrastructure: Application performance management sets new goals.

Storage systems manage access to information in one of three forms: as blocks of data of a predefined size, as files within a file structure or as objects stored with metadata that describe the information. Block storage systems and file-based network-attached storage systems are pervasive in IT environments today.

Object-based storage systems aren't necessarily new, but a new generation of object storage is gaining traction due to its massive scalability, flat namespace for access, data durability with immutability, self-protection and integrity validation.

The most important aspect of object storage is not the underlying technology or implementation specifics. Instead, it's that it solves problems that are not effectively addressed with more familiar block or file storage mechanisms, especially when it comes to high-capacity growth areas.

For uses such as large content repositories and storage for big data analytics, whether in public clouds from cloud service providers or in private clouds in traditional IT, an object storage device can handle billions of objects and petabytes or exabytes of capacity.

Defining the object storage space

In hierarchical file systems, two files in different directories can have the same name. Hierarchy doesn't exist in an object storage system. With a flat namespace, every object exists at the same level. The flat namespace -- or flat address -- is the storage pool.
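The contrast can be sketched with plain Python dictionaries. This is an illustration only, not how any particular file system or object store is built; the directory names, the `store_object` helper and the use of UUIDs as object identifiers are all assumptions made for the example.

```python
import uuid

# Hierarchical namespace: the same file name can appear in two directories,
# and a lookup walks the tree one path component at a time.
file_tree = {
    "finance": {"report.txt": b"Q1 numbers"},
    "marketing": {"report.txt": b"Campaign plan"},
}

def read_file(tree, path):
    """Resolve a slash-separated path by descending through each directory."""
    node = tree
    for part in path.strip("/").split("/"):
        node = node[part]
    return node

# Flat namespace: every object sits at the same level under a unique ID.
# (UUIDs are an assumption here; real systems choose their own ID scheme.)
object_pool = {}

def store_object(pool, data):
    """Add data to the pool and return the ID the caller must keep."""
    object_id = uuid.uuid4().hex
    pool[object_id] = data
    return object_id

finance_id = store_object(object_pool, b"Q1 numbers")
marketing_id = store_object(object_pool, b"Campaign plan")
```

In the tree, the name "report.txt" is only meaningful together with its directory. In the pool, each ID is meaningful on its own, and a lookup is a single step regardless of how many objects exist.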

Beyond the land of terabytes and petabytes lies the exabyte, which equals about a billion gigabytes.

Immutable refers to something that can't be changed. Using immutable data means that everything is recorded -- in contrast to systems where new values delete old values, thus removing them permanently. As storage costs drop, immutable data becomes more common.

Objects are containers for information, which could be files or other data created by applications. The objects are stored with metadata and are accessed through an interface such as Amazon Web Services S3 or the Cloud Data Management Interface, using commands such as GET and PUT. Objects are ultimately stored as blocks on storage devices, usually disks.

An Object ID identifies the object to the applications that request access. Management of the Object ID as a flat namespace, with the application responsible for retaining the knowledge of its accessible objects, is a distinguishing feature of object storage. This differs from file-based storage or direct access of data blocks on a volume.
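A toy in-memory store makes the PUT and GET pattern concrete. It is a sketch under stated assumptions, not the S3 or Cloud Data Management Interface API: the `ObjectStore` class, its `put` and `get` methods, the UUID-based IDs and the `app_catalog` dictionary are hypothetical names invented for illustration.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)

class ObjectStore:
    """Toy in-memory store with a flat namespace keyed by object ID."""

    def __init__(self):
        self._objects = {}

    def put(self, data, metadata=None):
        """Store data with its metadata and return a newly assigned object ID."""
        object_id = uuid.uuid4().hex  # assumed ID scheme, for illustration only
        self._objects[object_id] = StoredObject(data, dict(metadata or {}))
        return object_id

    def get(self, object_id):
        """Return the stored object, or raise KeyError for an unknown ID."""
        return self._objects[object_id]

# The application, not the store, remembers which IDs belong to it.
store = ObjectStore()
app_catalog = {}
app_catalog["site-logo"] = store.put(
    b"\x89PNG...", {"content-type": "image/png", "owner": "web-team"}
)

logo = store.get(app_catalog["site-logo"])
```

The store never learns the label "site-logo". If the application loses its catalog, the object still exists but the application no longer knows which ID to ask for.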

Overwhelmed with data

Today's massive data growth can be traced to a few basic phenomena.

Information from ongoing operations continues to increase. IT teams are reluctant to set deletion policies for data that's no longer required, so it's maintained for an extensive period of time. New multimedia applications demand large storage capacity. And finally, sensors, social media and other external, nontraditional sources of data for new analytics applications also produce a significant amount of information to manage.

Object storage technology addresses some of the major problems in storing and managing information, especially unstructured data. Object storage offers two primary advantages, scale and durability, though there are several others:

Scale. Meeting the demands for the increased influx of information means storage must scale to billions of objects -- beyond what typical file systems can do efficiently and still meet performance demands. Object storage's flat namespace allows access to the data object without traversing a hierarchical structure, improving performance and maintaining simplicity.

Durability. Object storage systems have availability and protection characteristics that add resiliency. With the potential need for geographically dispersed access, resiliency is especially important. Also, the size of data repositories practically requires today's storage systems to be self-protecting, since traditional data protection processes cannot run effectively at such a massive scale.

Immutability. Data is stored immutably, so new copies or updates become new versions rather than overwriting the old ones. This is in addition to the dispersal of data with forward error-correcting codes for availability protection. Because older versions remain accessible, versioning prevents data loss from application or usage errors, and managing the number of versions kept is a basic administrative function.

Non-disruptive migration. The ability to non-disruptively introduce new technology and retire old technology is a basic need. With data longevity spanning decades, data must advance with new technology without performing a data migration. Object storage devices include information-dispersal algorithms that allow a single node to be replaced with new technology and data to be automatically rebuilt and redistributed, providing transparent technology updating.

Meta management. Object storage also targets managing information, a common storage challenge. Metadata includes information on systems, users or applications stored with the objects. Metadata may also contain rules about access, retention controls, ownership, identification of the app or user who created the information and other inputs for automating management.
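Several of the characteristics above (immutable versions, integrity validation and metadata stored with the object) can be sketched together. This is a minimal illustration, not a vendor implementation: the `VersionedStore` class, the choice of SHA-256 as the checksum and the `retain-years` metadata key are assumptions made for the example.

```python
import hashlib

class VersionedStore:
    """Toy store in which a PUT never overwrites: it appends a new version."""

    def __init__(self):
        # object ID -> list of (data, sha256 digest, metadata), oldest first
        self._versions = {}

    def put(self, object_id, data, metadata=None):
        """Append a new immutable version and return its 1-based version number."""
        digest = hashlib.sha256(data).hexdigest()
        history = self._versions.setdefault(object_id, [])
        history.append((data, digest, dict(metadata or {})))
        return len(history)

    def get(self, object_id, version=None):
        """Return (data, metadata) for a version, the latest by default."""
        history = self._versions[object_id]
        number = len(history) if version is None else version
        if not 1 <= number <= len(history):
            raise KeyError(f"{object_id} has no version {number}")
        data, digest, metadata = history[number - 1]
        # Integrity validation: recompute the checksum before returning data.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored data no longer matches its checksum")
        return data, metadata

    def version_count(self, object_id):
        """Number of versions currently held for the object."""
        return len(self._versions[object_id])

store = VersionedStore()
store.put("contract-42", b"draft", {"owner": "legal"})
store.put("contract-42", b"final", {"owner": "legal", "retain-years": 7})

latest, latest_meta = store.get("contract-42")
first, _ = store.get("contract-42", version=1)
```

The second PUT does not destroy the draft, so an accidental update can be undone by reading version 1. A retention rule held in the metadata is the kind of input an automated policy engine could act on.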

To transparently access information from the underlying object technology, utility programs use object storage application programming interfaces (APIs). Traditional file-based applications access objects through gateway devices or object systems supporting file interfaces. Object APIs provide access to both cloud-based and on-premises object storage systems.
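The gateway idea can be sketched as a thin layer that keeps a path-to-object mapping on behalf of file-based applications. This is an illustration under assumptions, not a description of any product: the `FileGateway` class, its method names and the UUID-based object IDs are hypothetical.

```python
import uuid

class FileGateway:
    """Toy gateway: presents file paths to applications, stores objects underneath."""

    def __init__(self):
        self._objects = {}     # flat object pool: object ID -> bytes
        self._path_to_id = {}  # the gateway's own path -> object ID mapping

    def write_file(self, path, data):
        """Store the data as a new object and point the path at it."""
        object_id = uuid.uuid4().hex  # assumed ID scheme, for illustration only
        self._objects[object_id] = data
        # Rewriting a path repoints it; the earlier object stays in the pool.
        self._path_to_id[path] = object_id
        return object_id

    def read_file(self, path):
        """Translate the path to an object ID, then fetch the object."""
        return self._objects[self._path_to_id[path]]

    def list_dir(self, prefix):
        """Emulate a directory listing by filtering known paths on a prefix."""
        prefix = prefix.rstrip("/") + "/"
        return sorted(p for p in self._path_to_id if p.startswith(prefix))

gateway = FileGateway()
gateway.write_file("/projects/a/notes.txt", b"hello")
gateway.write_file("/projects/b/plan.txt", b"roadmap")
gateway.write_file("/archive/old.txt", b"2009 budget")

listing = gateway.list_dir("/projects")
```

The application sees what looks like a directory tree, but the hierarchy exists only in the gateway's mapping table. The object pool underneath remains flat.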

Figure 1: Object storage enables transparent access to data

The object storage cast of characters

New object storage device products and vendors are appearing all the time, varying in scale and complexity. Some store objects in clustered file systems, while others manage objects as data elements across attached storage devices. Some use network-attached storage with object interfaces to map objects to files without the scalability and metadata management advantages of a true object storage system. 

Some of the prominent object storage vendors include:

Vendor                  Product
Amplidata               Amplistor
Caringo                 CAStor
Cleversafe              dsNet
Data Direct Networks    Web Object Scaler
EMC                     Atmos & Centera
HP                      StoreAll
HDS                     Hitachi Content Platform
NetApp                  StorageGrid
Quantum                 Lattus
Scality                 Ring

Popular use cases

Object storage is the foundation of many popular cloud use cases. There are plenty of narrowly defined uses for object storage, but it is easier to understand by looking at the categories. All of these categories apply, regardless of whether services are delivered by cloud service providers or on-premises in traditional IT environments (sometimes referred to as private clouds).

Content repository. This is a very broad area, exploiting the large capacity and numbers of objects possible with object storage. A content repository can be many things, but it generally has fixed content accessed from multiple geographic locations. Access demands can vary, but content repositories usually expect high rates of access with high bandwidth requirements.

Big data analytics. Storing the information coming from sources such as sensors and other nontraditional inputs is challenging, and the scale often drives the decision to use object storage. Real-time analytics applied against the data yields near-term results, but additional value may be obtained from the data through subsequent business intelligence analysis.

Archive. When information no longer needs to be actively accessed or on primary storage with a fast response time, it can be moved to a less-expensive storage system with a different data protection profile. The archived information can come from applications with policy engines to manage retention and data movement, directly from apps or by user-directed movement. Object storage is common for archives because of its ability to scale to billions of objects and large capacities, as well as its self-protection and data integrity characteristics.

Collaboration and sharing. Sharing information globally is becoming more common, leading to many new collaboration and file-synchronization software products -- both on-premises and cloud-based. Object storage systems effectively handle the potential scale, performance and security challenges presented by collaboration and file sharing.

There is economic value in introducing object storage systems into both traditional IT and cloud service provider environments. Enterprises use object storage for existing file-based applications or new apps capable of storing and retrieving objects directly. Today, object storage helps IT store and retrieve information at scale, but it will ultimately become a building block for public and private clouds; more value and more uses will materialize over time.

About the author
Randy Kerns is senior strategist at Evaluator Group. He has more than 40 years' tech experience, teaches regularly and has written two books on storage. Contact him @rgkerns or Randy@EvaluatorGroup.com

This was last published in June 2014

Join the conversation

2 comments

Well, this was an informative article on object storage with a good graphic. Having observed the vendors involved in delivering object storage software over the past several years, I noticed that the list provided by Mr. Kerns did not include Basho (Riak CS) and Cloudian (HyperStore), both of which have been providing their object storage software for several years. Also missing are the OpenStack Swift object storage program, which has been commercialized by SwiftStack, and Ceph, which was commercialized by InkTank and recently purchased by Red Hat. There have also been some recent name changes. Caringo changed their software name from CAStor to Swarm. Amplidata recently introduced a software-only product called Himalaya, which will replace their hardware-based product called Amplistor. Quantum's offering is actually a re-branding of Amplidata and not a unique product.
Nice article, Randy. Object storage does deliver tremendous benefit as a storage solution to meet the massive growth in data creation. One of the most important capabilities you mention (so glad you did) is that object "targets managing information, a common storage challenge" through "meta management." The ability to tag objects with metadata goes beyond enabling the execution of policies, e.g. preservation, protection, retention, etc. It allows for context to be stored and persisted with each object throughout its life cycle. If properly implemented, it can allow that context to be enriched as users and/or applications update or add metadata tags. Metadata can be used to identify relationships between objects, create collections of similar objects and provide a new and dynamic way for data to be perceived. I started talking about objects from within the storage industry in the late '90s as part of developing a solution strategy for a large vendor. Ultimately, the runaway growth in demand for storage capacity and related cost is just a symptom of an information management problem that organizations continue to struggle with. Also, it's important to note how critical metadata is simply based on the government's comments regarding its online capture program, "we're just interested in the metadata and not the actual content" (or something to that effect). Metadata provides the means of identifying relationships between elements to spot trends and behaviors in order to take some type of preemptive action. That's what organizations are looking to do as well through Big Data/analytics initiatives. The purpose of my comment is to augment what you've already said about object storage from the perspective of my current role, which is focused on information governance.
Object has the potential to be a strong foundation enabling improvements in legal discovery of information by analyzing relationships between data elements to identify and collect only the most relevant information. Automation of policies is critical as is preserving information in an immutable format to meet regulatory compliance and legal veracity requirements.