To configure services for high availability in a data center, shared storage is normally required, which typically means administrators purchase expensive storage area network (SAN) devices. But there is an alternative: the Distributed Replicated Block Device (DRBD), a free software component available for the Linux operating system. In this tip, you'll learn why DRBD can save administrators the expense of deploying a SAN and how to set it up.
Setting up the synchronization
DRBD can be summarized as RAID 1 (mirroring) over the network. That means you need two nodes and a network to connect them. On each node, a storage device, typically a local hard disk, is reserved for the DRBD device. Once configured, DRBD synchronizes these devices in real time over the network. Unlike file-level synchronization tools such as rsync, DRBD does its work at the block layer, below the file system, which makes it suitable for almost any use.
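The block-layer mirroring described above can be illustrated with a small toy model. This is only a sketch, not DRBD's implementation: the class names `Disk` and `MirroredDevice`, the 4096-byte block size, and the use of in-memory buffers in place of real disks and a real network link are all assumptions made for illustration.

```python
BLOCK_SIZE = 4096  # assumed block size for this toy model


class Disk:
    """A toy block device backed by memory instead of a real disk."""

    def __init__(self, blocks):
        self.blocks = blocks
        self.data = bytearray(blocks * BLOCK_SIZE)

    def write_block(self, index, payload):
        if not 0 <= index < self.blocks:
            raise IndexError("block index out of range")
        if len(payload) != BLOCK_SIZE:
            raise ValueError("payload must be exactly one block")
        start = index * BLOCK_SIZE
        self.data[start:start + BLOCK_SIZE] = payload

    def read_block(self, index):
        if not 0 <= index < self.blocks:
            raise IndexError("block index out of range")
        start = index * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])


class MirroredDevice:
    """Writes every block to the local disk and to the peer's disk."""

    def __init__(self, local, peer):
        self.local = local
        self.peer = peer

    def write_block(self, index, payload):
        self.local.write_block(index, payload)
        # A direct call stands in for sending the block over the network.
        self.peer.write_block(index, payload)

    def read_block(self, index):
        # Reads are served from the local copy.
        return self.local.read_block(index)
```

Note that the mirror never looks at files or directories. It only copies numbered blocks, which is why whatever sits on top of the device gets replicated without needing to know about it.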
There are two setups for DRBD: active/passive or active/active. The active/passive setup closely resembles RAID 1. Data is written to the active device and replicated to the passive device. Normally, the passive device doesn't do anything, but if a failure occurs, it can be switched to become the active device. The active/passive setup is very popular in two-node, high-availability (HA) clusters.
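The active/passive behavior can be sketched as a toy model in which only the active node accepts writes and the passive node is promoted after a failure. The names `Node`, `ActivePassivePair` and `fail_over` are hypothetical, and a plain dictionary stands in for each node's storage device; real DRBD role handling is more involved.

```python
class Node:
    """A toy cluster node; `blocks` stands in for its storage device."""

    def __init__(self, name, role):
        self.name = name
        self.role = role  # "active" or "passive"
        self.alive = True
        self.blocks = {}


class ActivePassivePair:
    def __init__(self, first, second):
        self.nodes = [first, second]

    def active_node(self):
        for node in self.nodes:
            if node.alive and node.role == "active":
                return node
        return None

    def write(self, index, payload):
        active = self.active_node()
        if active is None:
            raise RuntimeError("no active node; fail over first")
        active.blocks[index] = payload
        # Replicate to the peer if it is reachable.
        for node in self.nodes:
            if node is not active and node.alive:
                node.blocks[index] = payload

    def fail_over(self):
        """Promote the surviving node if the active one has failed."""
        current = self.active_node()
        if current is not None:
            return current  # nothing to do, the active node is healthy
        for node in self.nodes:
            if not node.alive:
                node.role = "passive"
        for node in self.nodes:
            if node.alive:
                node.role = "active"
                return node
        raise RuntimeError("no surviving node to promote")
```

In this sketch, `fail_over` has to be called by someone or something. That call is the step that is either performed by hand or delegated to cluster software.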
The alternative is to use DRBD in an active/active setup. There are still two storage devices involved, but both can be accessed simultaneously, so both nodes can serve data at the same time, servicing more users with better performance. This setup comes with several additional requirements, though. To use an active/active DRBD setup, you also need a cluster-aware file system, such as Oracle Corp.’s OCFS2 or Red Hat Inc.’s Global File System. That is because only a cluster-aware file system can guarantee that simultaneous writes are properly synchronized over the network and that two nodes can’t write to the same file at the same time.
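To see why an active/active setup needs this coordination, consider a toy lock table that grants write access to a file to one node at a time. This is a sketch only: `ClusterLockManager` and its methods are hypothetical, and a single in-process dictionary stands in for the coordination that a real cluster-aware file system has to perform between nodes over the network.

```python
class ClusterLockManager:
    """Grants each path to at most one node at a time (toy model)."""

    def __init__(self):
        self._owners = {}  # path -> name of the node holding the lock

    def acquire(self, path, node):
        owner = self._owners.get(path)
        if owner is not None and owner != node:
            return False  # another node is already writing this file
        self._owners[path] = node
        return True

    def release(self, path, node):
        # Only the current owner may release the lock.
        if self._owners.get(path) == node:
            del self._owners[path]
```

A node that fails to acquire the lock must wait and retry instead of writing. An ordinary local file system has no such step, which is why it cannot safely be mounted on both nodes at once.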
Making the failover successful
DRBD has become very popular because it allows administrators to configure HA clusters without the need for an expensive SAN. Imagine the case of a Web server configured for HA: If the host that is currently running the Web service goes down, another host in the cluster can take over. To continue its work normally while running on the other node, the Web server needs access to the same documents it had while running on the original node. To ensure your Web server always serves the same files, you have to put them on a DRBD device.
Theoretically, you don't need HA clustering software to run DRBD, but having a cluster makes it easier to manage DRBD. Without HA software, the administrator needs to make sure a new node is assigned as the active node after a failure, which involves a manual operation. When included in a cluster, the cluster software will take care of the failover automatically, making sure that, after a brief interruption, the service can start again on the other node. Also, in an active/active setup, HA cluster software is typically used. This is because, on top of the DRBD device, a cluster file system must synchronize access to the device, and a cluster file system is managed by the HA cluster stack.
There are several caveats to consider with DRBD, and we'll cover them in greater detail in another tip, but the most immediate concern for admins is the connection between DRBD and the HA cluster stack. If the HA stack fails to manage the DRBD device properly, you risk ending up in a split-brain situation where both nodes think they're in charge. Fortunately, there is a good manual procedure to resolve issues like that, which I will also cover in another tip.
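The split-brain condition itself is simple to state in code: more than one node claims the active role at the same time. The following minimal sketch assumes the roles have already been collected into a dictionary. The function names are hypothetical and this is not how DRBD or any HA stack detects the condition internally.

```python
def nodes_claiming_active(roles):
    """Return the sorted names of all nodes that report the active role."""
    return sorted(name for name, role in roles.items() if role == "active")


def is_split_brain(roles):
    """True when more than one node believes it is the active node."""
    return len(nodes_claiming_active(roles)) > 1
```

For example, `is_split_brain({"alpha": "active", "beta": "active"})` evaluates to `True`, while a healthy active/passive pair evaluates to `False`.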
About the author: Sander van Vugt is an independent trainer and consultant living in the Netherlands. Van Vugt is an expert in Linux high availability, virtualization and performance and has completed several projects that implement all three. He is also a regular speaker at Linux conferences all over the world and the author of various Linux-related books, such as Beginning the Linux Command Line, Beginning Ubuntu Server Administration and Pro Ubuntu Server Administration.