Last year, at several seminars on advanced enterprise virtualization, I asked attendees about their virtualization deployments. Most had deployed VMware with the Virtual Machine File System (VMFS) as the storage file system, but few had deployed virtual machines (VMs) with raw device mapping (RDM) for storage access.
When creating storage for a VM, VMware's cluster file system, VMFS, is the default and the simplest path: virtual disks are created inside a VMFS logical volume on a logical unit number (LUN). When you then install a Linux OS in the VM, you can likewise accept the installer's defaults. While the simple default settings for creating VMs may speed deployment and appear to reduce management overhead, they pose problems. The central issue is that these settings target the simplest case without regard for particular system requirements. Defaults are often designed for a product evaluation, where ease and speed are the objectives. But particularly when it comes to disaster recovery (DR) and backup, consider the possible long-term effects of these default settings. Let's consider some enterprise scenarios to illustrate how default settings can be a bad move in virtual deployments.
Serverless backup and raw disk maps
As modern applications store more data, it takes more time to back up or restore an application's data. This runs counter to business disaster recovery demands for shorter recovery time objectives (RTOs). Compounding the situation, modern servers with more processor cores and memory can host higher VM consolidation ratios, which tax limited host backup/restore I/O bandwidth. Together, these three factors have resulted in backup and recovery times that are unacceptable today compared with just a few years ago.
The solution is to move the backup and restore process away from VMs and to a storage area network (SAN) with serverless backup products such as VMware Consolidated Backup (VCB). Serverless backup requires RDMs to SAN LUNs to allow the backup server direct access to the LUN. But if a VM was created with defaults, you must re-create virtual disks and move data into RDM-configured storage to support a serverless backup architecture. This necessitates a major departure from default configurations for virtual environments.
As RTOs become more compressed and data sets increase, many IT organizations have found that while virtualization can significantly improve server and application RTO, data restoration time has become the gating factor. To ward off this burgeoning problem, the trend has been to replicate data to a disaster recovery site or business continuity geo-mirrored data center to speed recovery. Essentially, replication is pre-staging the data at the recovery site. Many IT organizations simply replicate the LUNs that hold their VM images to the recovery or business continuity site.
But using the default configurations results in inefficient use of replication bandwidth and recovery site storage. The problem is that the storage array and array replication software only understand replicating at the granularity of a LUN, and only one VMFS logical volume can be created per SAN LUN. With defaults, a guest's swap, root and application data partitions all sit in virtual disks on that one volume. The result of replicating that LUN to a recovery site is that the swap and root partitions are replicated in addition to the application data. Replicating swap not only chews up bandwidth unnecessarily but is also completely useless at the recovery site, because booting a Linux guest OS will overwrite the contents of its swap partition.
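To get a feel for the overhead, here is a rough back-of-the-envelope sketch in Python. The partition sizes are made-up illustrative numbers, not measurements, and the function name is hypothetical. It also assumes a full copy of the LUN; arrays that ship only changed blocks will see different figures depending on how much each partition changes.

```python
def non_data_fraction(swap_gib: float, root_gib: float, data_gib: float) -> float:
    """Fraction of a single shared LUN that is swap or root rather than application data."""
    total = swap_gib + root_gib + data_gib
    if total <= 0:
        raise ValueError("total LUN size must be positive")
    return (swap_gib + root_gib) / total


# Illustrative sizes only (assumptions, not taken from any real deployment).
overhead = non_data_fraction(swap_gib=4, root_gib=16, data_gib=60)
print(f"{overhead:.0%} of the replicated LUN is not application data")
```

With these assumed sizes, a quarter of every full copy carries swap and root contents rather than application data.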
The best configuration for Linux guests in a replicated virtual environment is to create three separate LUNs for the guest virtual disks: a swap LUN, a root LUN and a data LUN (or LUNs). When VMs are created, RDMs to each of the three LUNs are mapped. When the Linux guest OS is installed in a VM, the swap partition is created on the swap device, the root partition on the root device and the application data partition(s) on the data device (or devices). This configuration enables selective replication policies to be applied to VMs. For example, the swap LUN would never be replicated, the root LUN would be replicated on an infrequent but regular basis (frequency is tied to configuration management change policies) and the data LUN(s) would be regularly replicated to meet application RTOs. Additionally, archiving policies would include root and data LUNs only.
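The selective policy described above can be sketched as a small lookup table. This is only an illustration in Python: the class, the function names and the replication intervals are hypothetical, and a real deployment would express the same rules in the storage array's own replication software.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class LunPolicy:
    role: str
    replicate_every_hours: Optional[int]  # None means never replicate
    archive: bool


# Intervals are placeholders; tie the root interval to your change-management
# policy and the data interval to the application's RTO.
POLICIES: Dict[str, LunPolicy] = {
    "swap": LunPolicy("swap", None, archive=False),
    "root": LunPolicy("root", 24 * 7, archive=True),
    "data": LunPolicy("data", 1, archive=True),
}


def luns_due_for_replication(hours_since_last: Dict[str, float]) -> List[str]:
    """Return the LUN roles whose replication interval has elapsed."""
    due = []
    for role, policy in POLICIES.items():
        if policy.replicate_every_hours is None:
            continue  # swap is never replicated
        # A role with no recorded replication is treated as overdue.
        if hours_since_last.get(role, float("inf")) >= policy.replicate_every_hours:
            due.append(role)
    return due


def luns_to_archive() -> List[str]:
    """Return the LUN roles covered by the archiving policy."""
    return [role for role, policy in POLICIES.items() if policy.archive]
```

For example, `luns_due_for_replication({"swap": 1000, "root": 10, "data": 2})` returns only the data LUN: swap is excluded outright, and root is not yet due under the assumed weekly interval.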
This brings up another tip for replicating VM virtual disks in an environment where VM snapshots are used for recovery or archive policies. VMware offers two modes of building RDMs: virtual and physical. Snapshots are disabled with physical RDMs but are available in virtual RDMs. When a VM snapshot is taken, normally all virtual RDMs associated with the VM are included in the snapshot. To exclude the swap RDM, you must create the virtual disk on the swap RDM as an independent disk.
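The snapshot rules in the previous paragraph can be modeled in a few lines of Python. This is a simplified sketch of the behavior as described here, not VMware's API; the class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GuestDisk:
    name: str
    rdm_mode: str              # "virtual" or "physical"
    independent: bool = False  # independent disks are left out of VM snapshots


def disks_in_snapshot(disks: List[GuestDisk]) -> List[str]:
    """Return the disks a VM snapshot would include under the rules above."""
    return [
        d.name
        for d in disks
        if d.rdm_mode == "virtual" and not d.independent
    ]


guest = [
    GuestDisk("root", "virtual"),
    GuestDisk("data", "virtual"),
    GuestDisk("swap", "virtual", independent=True),
]
print(disks_in_snapshot(guest))
```

In this model, marking the swap disk as independent is the only change needed to keep it out of the snapshot while root and data remain included.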
The bottom line
Planning a virtual environment for Linux guests requires forethought, and that almost never means using the vendors' default configurations for either the virtual infrastructure or the Linux OS installation. Planning for the future, including the use of replication for recovery -- whether to a remote site or a cloud provider -- requires an architecture that separates the operating system files from the swap file and the application data.
About the author:
Richard Jones is vice president and service director for Data Center Strategies at Midvale, Utah-based Burton Group. He can be reached at email@example.com.
What did you think of this feature? Write to SearchDataCenter.com's Matt Stansberry about your data center concerns at firstname.lastname@example.org.