Disaster planning relies on one simple principle: a copy of your business-critical data must exist in a physical location other than your primary data center. Off-site tape storage has been the traditional defense against disasters, but the logistics of recalling tapes from storage and then restoring them have always proven cumbersome. Today, the availability of the Internet and other long-distance WAN links gives disaster planners the option of remote replication -- copying vital data disk-to-disk between two physical locations across a WAN. Let's review the most important options and considerations involved in off-site tape and remote replication.
Tape is a well-established technology with a proven history in backup and archival tasks. Since tape uses removable media, it's a simple matter to guard against disaster by removing the completed tapes and sending them off-site. Small businesses may entrust tapes to employees, or lock the tapes in a nearby safe. Larger organizations use a storage service like Iron Mountain, which will pick up tapes and warehouse them at a secure location -- usually in another geographic region.
But tape technology also presents some critical liabilities. First, tape is a relatively slow technology. As storage volumes spiral upward, it takes more time to complete a backup, and more time to restore that backup. Shippers also introduce delays. It can take up to 24 hours to retrieve tapes from
Tape security is another serious concern. Employees can lose tapes, and tapes can be lost in shipping, all potentially compromising sensitive user data. Today, tape backup software and systems are increasingly turning to encryption to secure user data, though encryption often introduces further delays in the backup process.
To overcome the limitations of tape, companies are implementing disk-to-disk (D2D) technologies for added speed and reliability. However, disks are not removable by nature, and having a data copy in the same location does not protect against disaster. Consequently, disk storage is being replicated remotely to an off-site disk storage system using WAN links (such as the Internet) and replication software that can optimize the use of limited WAN bandwidth.
There are numerous ways to implement remote replication. The easiest method is straightforward replication, which creates a simple backup by copying files from the data center to another storage system elsewhere. The remote storage system may be an ordinary disk storage array or a virtual tape library (VTL). After a disaster, data at the remote location is moved back to the rebuilt data center or sent to an alternative site so that business may resume.
Another popular approach is to replicate data directly to a secondary data center that duplicates the essential capabilities of the original data center. A "hot-cold" approach basically receives data but keeps the secondary data center offline until a disaster occurs. It can take up to 12 hours (perhaps longer) to bring a "cold" site online. A "hot-warm" configuration is online and receives data from the primary data center, but only takes over processing tasks when the main data center is disrupted. It may take several hours to bring a "warm" site online. A "hot-hot" approach generally keeps a fully functional remote site online and running full-time, usually sharing some of the data processing tasks and remaining constantly synchronized with the primary data center. When disaster hits, the remote "hot" site continues working -- often without noticeable interruption.
Traditionally, a remote site needed to duplicate the storage hardware and configuration at the primary data center. Storage virtualization is changing this requirement. Since virtualization introduces a layer of abstraction between the storage hardware and the applications using it, administrators can employ different or older storage systems at the remote site.
WAN and bandwidth considerations
The key to remote replication is the WAN and its available bandwidth. It is critical that an enterprise evaluate the volume of data that must be synchronized between sites in a period of time, and then provide the bandwidth to accommodate those demands. For example, a typical broadband Internet connection may pass 600 Kb/s (about 270 MB/h), but a company with 2 TB of data to synchronize would need far more bandwidth, requiring T1, T3, or other high-bandwidth alternatives. The cost of bandwidth should factor into remote replication decisions.
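The sizing arithmetic above can be sketched in a few lines of Python. This is a rough estimate rather than a planning tool: it assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), uses the nominal T1 and T3 line rates of 1.544 Mb/s and 44.736 Mb/s, and includes an `efficiency` parameter that is purely an illustrative assumption to account for protocol overhead and contention. The function names are hypothetical.

```python
def throughput_mb_per_hour(link_bits_per_sec: float) -> float:
    """Convert a link speed in bits per second to megabytes per hour."""
    return link_bits_per_sec / 8 * 3600 / 1e6


def transfer_hours(data_bytes: float, link_bits_per_sec: float,
                   efficiency: float = 1.0) -> float:
    """Estimate the hours needed to move data_bytes across a link.

    efficiency is the usable fraction of the nominal link speed
    (an assumption for illustration; real links rarely reach 1.0).
    """
    if link_bits_per_sec <= 0:
        raise ValueError("link speed must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    seconds = data_bytes * 8 / (link_bits_per_sec * efficiency)
    return seconds / 3600


# Nominal link speeds in bits per second.
LINKS = {
    "broadband (600 Kb/s)": 600e3,
    "T1 (1.544 Mb/s)": 1.544e6,
    "T3 (44.736 Mb/s)": 44.736e6,
}

FULL_SYNC_BYTES = 2e12  # the 2 TB example from the text

for name, speed in LINKS.items():
    hours = transfer_hours(FULL_SYNC_BYTES, speed)
    print(f"{name}: {throughput_mb_per_hour(speed):,.0f} MB/h, "
          f"full 2 TB sync in about {hours:,.0f} hours")
```

Under these assumptions, a full 2 TB synchronization over the 600 Kb/s link would take roughly 7,400 hours, while a T3 would bring it down to roughly 99 hours, which is why the initial copy and the daily change rate need to be sized separately.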
There are several techniques that can save money by reducing bandwidth requirements. Delta differential synchronization transfers only the blocks or files that have changed since the last synchronization (similar to an incremental tape backup). Although there may be 2 TB of data at each location, a delta of 1 GB per day can be updated across the same broadband Internet link (at about 270 MB/h) in less than four hours per day. Data deduplication (also known as intelligent compression) is an emerging technique that eliminates redundant blocks and files -- transferring only one instance of each unique piece of information. For example, instead of transferring four copies of a new 100 MB PowerPoint presentation, only one copy would be transferred (saving 300 MB of data transfers).
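Both ideas can be illustrated with content hashes. The sketch below is not how any particular replication product works; it assumes fixed 4 KB blocks, SHA-256 digests, and in-memory byte strings, and the function names are hypothetical. `changed_blocks` shows the delta idea (send only blocks whose digest differs from the previous copy), and `deduplicate` shows file-level deduplication (store identical content once and point every file name at it).

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes,
                   block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Delta sync: return {block index: block} for every block of `new`
    that is absent from `old` or differs from the block at that index."""
    old_digests = [hashlib.sha256(b).digest()
                   for b in split_blocks(old, block_size)]
    delta = {}
    for i, block in enumerate(split_blocks(new, block_size)):
        if i >= len(old_digests) or old_digests[i] != hashlib.sha256(block).digest():
            delta[i] = block
    return delta


def deduplicate(files: Dict[str, bytes]) -> Tuple[Dict[str, bytes], Dict[str, str]]:
    """File-level deduplication: return (unique content keyed by digest,
    file name -> digest). Identical files share one stored copy."""
    store: Dict[str, bytes] = {}
    index: Dict[str, str] = {}
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)  # keep only the first copy
        index[name] = digest
    return store, index


# Only the modified second block and the appended tail need to cross the WAN.
old = b"a" * 4096 + b"b" * 4096 + b"c" * 4096
new = b"a" * 4096 + b"X" * 4096 + b"c" * 4096 + b"tail"
print(sorted(changed_blocks(old, new)))

# Four identical presentations are stored (and transferred) once.
deck = b"slide" * 1000
store, index = deduplicate({f"copy{i}.ppt": deck for i in range(4)})
print(len(store), "unique file(s) for", len(index), "names")
```

A real implementation would also have to send the new file length so the remote side can truncate shrunken files, and fixed-offset blocks miss data that has merely shifted position; both are left out here to keep the sketch short.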
Companies are also being more selective about the applications and files that are covered by remote replication. Rather than protecting PDFs, PPTs and non-essential Word documents, a replication system may be configured to cover only Oracle, Exchange, and other mission-critical business applications -- relegating other applications and data types to more traditional D2D or tape-based backups.
If the WAN goes down, so does the replication, so a company must decide how to handle inevitable WAN interruptions. Again, there are numerous options depending on individual business needs, but typical alternatives include waiting for the link to return, trying an alternative WAN connection, and falling back to traditional tape or local disk backup.
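One way to picture how those alternatives combine is as an ordered fallback policy: wait and retry on the primary link, move to an alternative link, and finally fall back to a local backup. The sketch below is only an illustration of that ordering. Every name in it is hypothetical, the retry count and wait time are arbitrary assumptions, and the `send` and `local_backup` callables stand in for whatever a real replication or backup tool would provide.

```python
import time
from typing import Callable, Sequence, Tuple


def replicate_with_fallback(
    links: Sequence[Tuple[str, Callable[[], None]]],
    local_backup: Callable[[], None],
    retries_per_link: int = 3,
    wait_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each WAN link in order and return the name of the path that worked.

    links        -- (name, send) pairs; send() raises ConnectionError on failure
    local_backup -- called only if every link has failed every attempt
    sleep        -- injectable so the waiting can be skipped in tests
    """
    for name, send in links:
        for attempt in range(retries_per_link):
            try:
                send()
                return name
            except ConnectionError:
                # Wait before retrying the same link, but not after the
                # final attempt: move straight on to the next option.
                if attempt < retries_per_link - 1:
                    sleep(wait_seconds)
    local_backup()
    return "local"


# Hypothetical stand-ins for real transfer and backup routines.
def primary_link() -> None:
    raise ConnectionError("primary WAN link is down")


def secondary_link() -> None:
    pass  # pretend the transfer succeeded


used = replicate_with_fallback(
    [("primary", primary_link), ("secondary", secondary_link)],
    local_backup=lambda: None,
    sleep=lambda seconds: None,  # skip the real wait in this demonstration
)
print("replicated via:", used)
```

Whichever path is taken, data written to a local fallback still has to reach the remote site once the WAN returns, so a policy like this would normally be paired with a catch-up synchronization.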