
An object storage device for a data overload

For rampant data growth, try object storage on for size.

This article can also be found in the Premium Editorial Download "Modern Infrastructure: Application performance management sets new goals."

Storage systems manage access to information in one of three forms: as blocks of data of a predefined size, as files within a file structure or as objects stored with metadata that describes the information. Block storage systems and file-based network-attached storage systems are pervasive in IT environments today.

Object-based storage systems aren't necessarily new, but a new generation of object storage is gaining traction due to its massive scalability, flat namespace for access, data durability with immutability, self-protection and integrity validation.

The most important aspect of object storage is not the underlying technology or implementation specifics. Instead, it's that it solves problems that are not effectively addressed with more familiar block or file storage mechanisms, especially when it comes to high-capacity growth areas.

For uses such as large content repositories and storage for big data analytics, whether in public clouds from cloud service providers or in private clouds in traditional IT, an object storage device can handle billions of objects and petabytes or exabytes of capacity.

Defining the object storage space

In hierarchical file systems, two files in different directories can have the same name. Hierarchy doesn't exist in an object storage system. With a flat namespace, every object exists at the same level. The flat namespace -- or flat address -- is the storage pool.

Beyond the land of terabytes lies the exabyte, which equals about a billion gigabytes.
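The unit arithmetic behind that figure can be checked in a few lines. This is a minimal sketch that assumes decimal (SI) units; the 4 MB object size is a hypothetical value chosen only to illustrate object counts at this scale.

```python
# Decimal (SI) storage units, expressed in bytes.
GIGABYTE = 10**9
TERABYTE = 10**12
PETABYTE = 10**15
EXABYTE = 10**18

# An exabyte is a billion gigabytes, or a million terabytes.
gigabytes_per_exabyte = EXABYTE // GIGABYTE
terabytes_per_exabyte = EXABYTE // TERABYTE

# Hypothetical example: how many 4 MB objects fit in one petabyte?
object_size = 4 * 10**6
objects_per_petabyte = PETABYTE // object_size

print(f"{gigabytes_per_exabyte:,} GB per EB")
print(f"{terabytes_per_exabyte:,} TB per EB")
print(f"{objects_per_petabyte:,} objects of 4 MB per PB")
```

At that assumed object size, a single petabyte already holds hundreds of millions of objects, which is why object counts in the billions come up so quickly.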

Immutable refers to something that can't be changed. Using immutable data means that everything is recorded -- in contrast to systems where new values delete old values, thus removing them permanently. As storage costs drop, immutable data becomes more common.
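One way to picture immutable data is an append-only store in which a write adds a new version instead of replacing the old value. The sketch below is illustrative only: the class name `VersionedStore` and its methods are hypothetical and do not correspond to any vendor's interface.

```python
class VersionedStore:
    """Append-only store: a write adds a version, it never overwrites one."""

    def __init__(self):
        self._versions = {}  # key -> list of payloads, oldest first

    def put(self, key, data):
        """Record data as the newest version of key; return its version number."""
        history = self._versions.setdefault(key, [])
        history.append(bytes(data))
        return len(history)  # version numbers start at 1

    def get(self, key, version=None):
        """Return the latest version, or a specific older one if requested."""
        history = self._versions[key]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise KeyError(f"{key!r} has no version {version}")
        return history[version - 1]

    def version_count(self, key):
        return len(self._versions.get(key, []))


store = VersionedStore()
first = store.put("report", b"draft")
second = store.put("report", b"final")

print(store.get("report"))             # newest version
print(store.get("report", version=1))  # the older version is still readable
```

Because the earlier value is still there after the second write, an accidental update can be undone by reading the previous version.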

Objects are containers for information, which could be files or other data created by applications. The objects are stored with metadata and are accessed through a method such as Amazon Web Services S3 or the Cloud Data Management Interface, using commands such as GET and PUT. Objects are ultimately stored as blocks on storage devices, usually disks.

An Object ID identifies the object to the applications that request access. Management of the Object ID as a flat namespace, with the application responsible for retaining the knowledge of its accessible objects, is a distinguishing feature of object storage. This differs from file-based storage or direct access of data blocks on a volume.
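The relationship between Object IDs, metadata and the application can be sketched with a small in-memory model. This is not the S3 or CDMI API; `FlatObjectStore`, its `put` and `get` methods and the use of random IDs are assumptions made for illustration (real systems differ in how IDs or keys are assigned).

```python
import uuid


class FlatObjectStore:
    """Every object lives at the same level, addressed only by its ID."""

    def __init__(self):
        self._objects = {}  # object ID -> (data, metadata)

    def put(self, data, metadata=None):
        """Store an object with its metadata and return a new Object ID."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (bytes(data), dict(metadata or {}))
        return object_id

    def get(self, object_id):
        """Return (data, metadata) for an Object ID; no directory traversal."""
        return self._objects[object_id]


store = FlatObjectStore()

# Two objects that share a file name still get distinct IDs.
id_a = store.put(b"Q1 numbers", {"name": "report.txt", "owner": "finance"})
id_b = store.put(b"Q2 numbers", {"name": "report.txt", "owner": "finance"})

# The application, not the store, remembers which IDs it cares about.
catalog = {"q1-report": id_a, "q2-report": id_b}

data, metadata = store.get(catalog["q2-report"])
print(data, metadata["owner"])
```

The `catalog` dictionary stands in for whatever index the application keeps; the store itself has no hierarchy to consult.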

Overwhelmed with data

Today's massive data growth can be traced to a few basic phenomena.

Information from ongoing operations continues to increase. IT teams are reluctant to set deletion policies for data that's no longer required, so it's maintained for an extensive period of time. New multimedia applications demand large storage capacity. And finally, sensors, social media and other external, nontraditional sources of data for new analytics applications also produce a significant amount of information to manage.

Object storage technology addresses some of the major problems in storing and managing information, especially unstructured data. Its two primary advantages are scale and durability, and several others follow from its design:

Scale. Meeting the demands for the increased influx of information means storage must scale to billions of objects -- beyond what typical file systems can do efficiently and still meet performance demands. Object storage's flat namespace allows access to the data object without traversing a hierarchical structure, improving performance and maintaining simplicity.

Durability. Object storage systems have availability and protection characteristics that add resiliency. With the potential need for geographically dispersed access, resiliency is especially important. Also, the size of data repositories practically requires today's storage systems to be self-protecting, since traditional data protection processes cannot run effectively at such a massive scale.

Immutability. Data is stored immutably, so new copies or updates become new versions rather than overwriting existing data. This is in addition to dispersing data with forward error-correcting codes for availability protection. Because older versions remain accessible, versioning prevents data loss from application or usage errors, and managing the number of retained versions becomes a basic administrative function.

Non-disruptive migration. The ability to non-disruptively introduce new technology and retire old technology is a basic need. With data longevity spanning decades, data must advance with new technology without performing a data migration. Object storage devices include information-dispersal algorithms that allow a single node to be replaced with new technology and data to be automatically rebuilt and redistributed, providing transparent technology updating.

Metadata management. Object storage also targets managing information, a common storage challenge. Metadata includes information on systems, users or applications stored with the objects. Metadata may also contain rules about access, retention controls, ownership, identification of the app or user who created the information and other inputs for automating management.
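The self-protection and rebuild behavior described in the list above depends on dispersing each object across nodes together with redundant data. The sketch below uses a single XOR parity fragment, which is a deliberately simplified stand-in: production systems use stronger erasure codes that tolerate more than one failure. The function names are hypothetical.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data, k):
    """Split data into k equal-size fragments plus one XOR parity fragment."""
    size = -(-len(data) // k)                 # ceiling division
    padded = data.ljust(size * k, b"\0")      # pad so fragments are equal
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(size)                      # starts as all zero bytes
    for fragment in fragments:
        parity = xor_bytes(parity, fragment)
    return fragments + [parity]


def rebuild(fragments, missing):
    """Recompute the one lost fragment by XOR-ing all surviving fragments."""
    survivors = [f for i, f in enumerate(fragments) if i != missing]
    result = bytes(len(survivors[0]))
    for fragment in survivors:
        result = xor_bytes(result, fragment)
    return result


payload = b"object storage payload"
k = 4
nodes = split_with_parity(payload, k)   # one fragment per node, 5 nodes total

lost = nodes[2]
nodes[2] = None                         # simulate retiring or losing a node
nodes[2] = rebuild(nodes, missing=2)    # repopulate the replacement node

restored = b"".join(nodes[:k])[:len(payload)]
print(restored == payload)
```

Replacing a node then amounts to recomputing its fragment from the others, which is why the application never sees a migration.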

To transparently access information from the underlying object technology, utility programs use object storage application programming interfaces (APIs). Traditional file-based applications access objects through gateway devices or object systems supporting file interfaces. Object APIs provide access to both cloud-based and on-premises object storage systems.
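A gateway of the kind described here essentially translates file paths into object keys. The following sketch assumes the underlying object store can be modeled as a dictionary of key to bytes; `FileGateway` and its methods are hypothetical names, not a real product's interface.

```python
import posixpath


class FileGateway:
    """Presents file-style paths on top of a flat key -> bytes object store."""

    def __init__(self, object_store):
        self._store = object_store  # any dict-like mapping of key -> bytes

    @staticmethod
    def _key(path):
        # Normalize the path and use it, minus leading slashes, as a flat key.
        return posixpath.normpath("/" + path).lstrip("/")

    def write_file(self, path, data):
        self._store[self._key(path)] = bytes(data)  # becomes a PUT

    def read_file(self, path):
        return self._store[self._key(path)]         # becomes a GET

    def list_dir(self, path):
        """Simulate a directory listing by scanning keys with a shared prefix."""
        prefix = self._key(path)
        prefix = prefix + "/" if prefix else ""
        names = set()
        for key in self._store:
            if key.startswith(prefix):
                names.add(key[len(prefix):].split("/", 1)[0])
        return sorted(names)


objects = {}
gateway = FileGateway(objects)
gateway.write_file("/projects/a.txt", b"alpha")
gateway.write_file("/projects/sub/b.txt", b"beta")
gateway.write_file("/other.txt", b"gamma")

print(sorted(objects))                # flat keys, no real directories
print(gateway.list_dir("/projects"))  # directory view rebuilt from prefixes
```

The directories exist only in the gateway's view; the store underneath holds three keys at the same level.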

Figure 1: Object storage enables transparent access to data

The object storage cast of characters

New object storage device products and vendors are appearing all the time, varying in scale and complexity. Some store objects in clustered file systems, while others manage objects as data elements across attached storage devices. Some use network-attached storage with object interfaces to map objects to files without the scalability and metadata management advantages of a true object storage system. 

Some of the prominent object storage vendors include:

Vendor | Product
Amplidata | Amplistor
Caringo | CAStor
Cleversafe | dsNet
Data Direct Networks | Web Object Scaler
EMC | Atmos & Centera
HP | StoreAll
HDS | Hitachi Content Platform
NetApp | StorageGrid
Quantum | Lattus
Scality | Ring

Popular use cases

Object storage is the foundation of many popular cloud use cases. There are plenty of narrowly defined uses for object storage, but it is easier to understand by looking at the categories. All of these categories apply, regardless of whether services are delivered by cloud service providers or on-premises in traditional IT environments (sometimes referred to as private clouds).

Content repository. This is a very broad area, exploiting the large capacity and numbers of objects possible with object storage. A content repository can be many things, but it generally has fixed content accessed from multiple geographic locations. Access demands can vary, but content repositories usually expect high rates of access with high bandwidth requirements.

Big data analytics. Storing the information coming from sources such as sensors and other nontraditional inputs is challenging, and the scale often drives the decision to use object storage. Real-time analytics applied against the data yields near-term results, but additional value may be obtained from the data through subsequent business intelligence analysis.

Archive. When information no longer needs to be actively accessed or kept on primary storage with a fast response time, it can be moved to a less-expensive storage system with a different data protection profile. The archived information can come from applications with policy engines that manage retention and data movement, directly from apps or by user-directed movement. Object storage is common here because it scales to billions of objects and very large capacities, and because of its self-protection and data integrity characteristics.

Collaboration and sharing. Sharing information globally is becoming more common, leading to many new collaboration and file-synchronization software products -- both on-premises and cloud-based. Object storage systems effectively handle the potential scale, performance and security challenges presented by collaboration and file sharing.
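The policy-driven movement mentioned under the archive use case can be sketched as a pair of rules evaluated against object metadata. The metadata fields `last_accessed` and `retain_until`, the 90-day idle threshold and the function names are all assumptions for illustration, not features of a particular product.

```python
from datetime import datetime, timedelta


def select_for_archive(objects, now, idle_days=90):
    """Return IDs of objects not accessed within the last idle_days."""
    cutoff = now - timedelta(days=idle_days)
    return sorted(
        object_id
        for object_id, meta in objects.items()
        if meta["last_accessed"] < cutoff
    )


def select_deletable(objects, now):
    """Return IDs of objects whose retention period has ended."""
    return sorted(
        object_id
        for object_id, meta in objects.items()
        if meta["retain_until"] <= now
    )


# Hypothetical metadata, keyed by object ID.
objects = {
    "obj-1": {"last_accessed": datetime(2014, 1, 10),
              "retain_until": datetime(2021, 1, 1)},
    "obj-2": {"last_accessed": datetime(2014, 5, 30),
              "retain_until": datetime(2014, 6, 1)},
    "obj-3": {"last_accessed": datetime(2013, 11, 2),
              "retain_until": datetime(2014, 1, 1)},
}
now = datetime(2014, 6, 15)

to_archive = select_for_archive(objects, now)
deletable = select_deletable(objects, now)
print(to_archive)
print(deletable)
```

Keeping the rules as plain functions over metadata means the same evaluation can run whether the objects sit with a cloud service provider or on-premises.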

There is economic value in introducing object storage systems into both traditional IT and cloud service provider environments. Enterprises use object storage for existing file-based applications or new apps capable of storing and retrieving objects directly. Today, object storage helps IT store and retrieve information at scale, but it will ultimately become a building block for public and private clouds; more value and more uses will materialize over time.

About the author
Randy Kerns is senior strategist at Evaluator Group. He has more than 40 years' tech experience, teaches regularly and has written two books on storage. Contact him @rgkerns or Randy@EvaluatorGroup.com

This was first published in June 2014
