By Anil Patrick, Chief Editor, SearchDataCenter.in
"If anything can go wrong, it will." – Murphy's Law
Data backup is not as simple as it sounds, as experienced practitioners will agree. Here's our take on how midsized businesses can select backup and recovery solutions that make the grade.
With an increasing number of Indian organizations going 24/7, businesses today need maximum uptime and faster recovery when problems occur. A case in point is Meru Cab Company, which operates taxi services in major Indian cities.
With a focus on providing world-class service, Meru has 24/7 operations, a GPS-enabled fleet and the least tolerance for any form of downtime. Hence, the company is determined to have the best possible backup mechanisms in place. As Nilesh Sangoi, CTO, Meru Cab Company, explains, "In our company, cabs work even at midnight, so even the slightest downtime can result in significant losses and customer dissatisfaction. Hence, we are working on aspects such as improving availability and building redundancy. Right now we have LTO tape-based backup using Veritas NetBackup. Subsequently, we plan to move into real-time replication in order to reduce the recovery time."
While there are many organizations like Meru that swear by their sound data backup strategies with an eye on impending growth, the typical Indian midsized business still has a long way to go on that front. It does not help that the backup configuration of many Indian midsized businesses is at best comparable to a badly prepared khichdi: a hotchpotch of (often obsolete) point solutions that run more on faith than on actual processes and proper implementation. Sadly, these risks usually come to light only when Murphy's Law strikes in the form of backup errors, failed restores and performance bottlenecks. The changing business requirements of Indian organizations on the backup front are not helping either, as we shall see shortly.
Increase in requirements, lesser capabilities
The backup-related shortcomings of midsized organizations are further aggravated by the constant increase in data volumes. The main contributors to this growth have been enterprise-wide applications such as enterprise resource planning and customer relationship management, flanked by email and digitization of content.
"Data size is the most common data backup challenge that we face right now. Such growth in data makes it difficult to maintain and monitor the backup structure needed for restores," says Meheriar Patel, CIO, general manager and head of IT, human resources and admin, Globus Stores.
The exponential data growth has brought along with it the perennial plague of backup solutions, namely reliability problems. This poses considerable risks to organizations that may run into tremendous losses for each minute (or less) of downtime. "The more data IT operations must back up, the more it will strain existing backup operations. Backup is already one of the most time-consuming and error-prone IT operations in the data center," says Stephanie Balaouras, principal analyst, Forrester Research.
Backup windows are also facing a challenge due to the increased data loads. For businesses that operate 24/7, routine backups are a challenge due to the performance hits on applications, servers and networks. Add to this the business demand for more recovery points due to the increasing number of high-availability applications, and the situation becomes quite ugly.
Unstructured data on user desktops (in the form of emails, spreadsheets and documents) is yet another increasingly important backup requirement for midsized businesses. Such data contains significant amounts of business-critical information and is difficult to back up. "While you can use backup methods such as point in time, snapshots and DVDs, the size of that unstructured data is also huge. It's probably not a backup issue, but the fact remains that a lot of that information is not getting utilized effectively in the organization. When that person leaves, the data is backed up, but no one can physically go through that data. Most midsized businesses don't invest in setups which can optimally handle such data," says Sangoi.
Thus the constant increase in data volumes has left organizations with no option but to opt for backup setups that rely on faster hardware and networks, supplemented by more bandwidth and efficient backup technologies. So what are the attributes that you should look for when selecting a backup solution?
Get the apt sizing
The first step in the evaluation process is to list your business requirements. It's essential to understand the effect that each minute of downtime will have on your organization, and how much downtime and data loss is acceptable. This will help you determine aspects such as the recovery point objective (RPO) and recovery time objective (RTO) for your business applications.
Once you have a clear understanding of the impact of downtime and have determined the acceptable levels of data loss, justifying the required investments becomes easier. It also helps you get a clear grasp of the required backup strategy, since a smaller RPO and RTO will translate into substantially increased investments.
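To make the RPO and RTO trade-off concrete, here is a minimal Python sketch. All names and figures in it (the `Objectives` class, the cost per minute, the backup intervals) are hypothetical illustrations, not data from any organization quoted here. It assumes the worst-case data loss equals one full backup interval, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Objectives:
    """Hypothetical container for an application's recovery objectives."""
    rpo_minutes: float  # maximum tolerable data loss, expressed as time
    rto_minutes: float  # maximum tolerable time to restore service


def worst_case_data_loss_minutes(backup_interval_minutes: float) -> float:
    # Assumption: a failure just before the next backup loses
    # one full interval's worth of changes.
    return backup_interval_minutes


def meets_objectives(obj: Objectives,
                     backup_interval_minutes: float,
                     restore_minutes: float) -> bool:
    """True if the backup schedule and restore time satisfy both objectives."""
    return (worst_case_data_loss_minutes(backup_interval_minutes) <= obj.rpo_minutes
            and restore_minutes <= obj.rto_minutes)


def downtime_cost(cost_per_minute: float, outage_minutes: float) -> float:
    """Estimated business loss for an outage of the given length."""
    return cost_per_minute * outage_minutes


# Illustrative figures only: a 1-hour RPO and a 4-hour RTO.
booking_app = Objectives(rpo_minutes=60, rto_minutes=240)
nightly_ok = meets_objectives(booking_app, backup_interval_minutes=1440, restore_minutes=180)
half_hourly_ok = meets_objectives(booking_app, backup_interval_minutes=30, restore_minutes=180)
```

In this example a nightly backup fails the one-hour RPO even though the restore time is within the RTO, which is the kind of gap that justifies spending on more frequent recovery points.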
"Look at minimizing downtime, but in the most optimal manner. Don't invest in too many things. For example, if you need off-site replication, remember that you need to invest not only in suitable software and hardware, but also in that kind of bandwidth. Real-time replication cannot happen over low-bandwidth links, and such high-capacity links have recurring costs. So you need to balance the business requirements with needed investments," says Sangoi.
More than just backup
Let's get one thing straight at this point: The age of point products to manage backups, replication and recovery is over. These setups result in chaos (and often tears) when it comes to management, reporting and restores. So if you are planning such an approach, read no further.
It's best to opt for an integrated data protection suite that offers capabilities like continuous data protection (CDP) along with backup, snapshots and replication. "IT operations professionals want one console, one technology engine and one metadata repository for data protection — not a bunch of point products for backup, snapshots, replication and CDP. This approach is known as unified data protection (UDP)," explains Balaouras.
With capabilities to provide real-time restores at the file and block levels, CDP is quite useful for backups of high-availability applications. It is also handy when performing real-time backups of unstructured data in laptops and desktops. While evaluating CDP features, ensure that the solution provides flexibility to customize the restore points as per your requirements. CDP may come with additional network and system overheads, so always factor in those aspects (and required additional investments) as well during the evaluation.
Virtual full backup (also known as synthetic backup) is the capability that you can look out for in case you decide that a backup solution with CDP thrown in exceeds your budget. The good thing about virtual full backups is they allow you to rapidly create full restores from an initial full backup along with subsequent incremental backups. Virtual full backups also help the organization reduce backup windows due to the incremental approach.
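The idea behind a virtual full backup can be sketched in a few lines of Python. This is a simplified illustration rather than how any particular product implements it: backups are modeled as dictionaries mapping a file path to its content, and the `DELETED` marker and the `synthesize_full` function are hypothetical names introduced for the example.

```python
# Hypothetical marker: an incremental records a deleted file as None.
DELETED = None


def synthesize_full(full: dict, incrementals: list) -> dict:
    """Build a new full backup image from an initial full backup plus
    incrementals applied in chronological order, without re-reading
    the production system."""
    result = dict(full)  # copy, so the original full backup is untouched
    for incremental in incrementals:
        for path, content in incremental.items():
            if content is DELETED:
                result.pop(path, None)   # file was removed since the last backup
            else:
                result[path] = content   # file was added or changed
    return result


sunday_full = {"orders.db": "v1", "rates.xls": "v1"}
monday_inc = {"orders.db": "v2"}
tuesday_inc = {"rates.xls": DELETED, "fleet.csv": "v1"}

tuesday_full = synthesize_full(sunday_full, [monday_inc, tuesday_inc])
```

Only the small incrementals have to be pulled from the production servers each day; the merge into a restorable full image happens on the backup side.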
The next critical aspect to consider is data deduplication. This is perhaps a life saver in the long run since it eliminates duplicate copies of data in backups. Deduplication capabilities are available at the file and block levels. According to Balaouras, backup software offerings that support block-level deduplication are very effective for consolidated remote-office backup as well as virtualized server environments with low to medium transaction volumes.
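As a rough illustration of block-level deduplication, the Python sketch below splits data into fixed-size blocks and stores each unique block only once, keyed by its SHA-256 hash. The function names, the in-memory dictionary used as a block store and the 4 KB block size are assumptions made for the example; commercial products typically use more sophisticated chunking and on-disk indexes.

```python
import hashlib


def dedup_store(data: bytes, store: dict, block_size: int = 4096) -> list:
    """Split data into fixed-size blocks, keep each unique block once in
    `store`, and return the list of block hashes needed to rebuild it."""
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # store the block only if it is new
        recipe.append(digest)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its list of block hashes."""
    return b"".join(store[digest] for digest in recipe)


block_store = {}
# Four blocks of data, three of which are identical.
payload = b"A" * 4096 * 3 + b"B" * 4096
payload_recipe = dedup_store(payload, block_store)
unique_blocks = len(block_store)
```

Here four blocks are backed up but only two are stored, and the data can still be restored in full from the recipe.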
System restore features are critical if you need to recover the entire system rather than just the data. Other useful aspects to look for are backup data encryption, granular object recovery and server virtualization support, along with archiving and data migration capabilities.
Many vendors provide backup monitoring services as a value addition for the backup solution. If you are able to negotiate a good deal on this front, it's best to go in for it. As part of such services, the vendor's team provides remote monitoring, periodic reports and alerts (in case of faulty backups). "When signing up for backup monitoring services, the clauses to look out for are those that deal with issues such as fix of product issues, resolution times, response time, statement of work, reports and their formats, escalation matrix and the penalty," says Sangoi.
Backup software implementation and thereafter
Most CIOs agree that it's no longer very difficult to roll out a backup solution as such. However, it's hard going when it comes to constantly monitoring and tweaking the application to conform to required objectives.
During implementation of the backup solution, it's best to introduce a disk target such as a near-line appliance or a virtual tape library (VTL) rather than tape. The disk target will act as a buffer layer before getting backed up on tape. Such an arrangement will ensure shorter backup windows and faster restores.
The next step is to size your network infrastructure. In the case of high-availability requirements, it might be better to have a separate Ethernet network for the backup infrastructure. In case that's not feasible, opt for Gigabit Ethernet for your LAN and 4 Gbps Fibre Channel for the storage area network.
It's essential to ensure that you have a sufficient number of media servers. "Too often, backup environments are not sized appropriately given the amount of data that actually needs to be backed up and protected. Have your backup vendor or another third-party consultant perform an assessment to ensure the environment is sized appropriately. Without enough media servers to handle the size and number of concurrent backups you need to support, using disk might not improve your backup performance if you can't generate the backup streams to keep the appliance or VTL busy," recommends Balaouras.
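A back-of-the-envelope version of such a sizing exercise can be written as a short Python calculation. The function names and every figure below (data volume, backup window, per-server throughput) are hypothetical; real throughput depends on the hardware, network and backup software, so treat this as a first estimate before a proper assessment.

```python
import math


def required_throughput_mb_s(data_tb: float, window_hours: float) -> float:
    """Sustained throughput needed to move the data within the backup window.
    Uses decimal units: 1 TB = 1,000,000 MB."""
    return data_tb * 1_000_000 / (window_hours * 3600)


def media_servers_needed(data_tb: float,
                         window_hours: float,
                         per_server_mb_s: float) -> int:
    """Minimum number of media servers, assuming each sustains
    `per_server_mb_s` and the load spreads evenly across them."""
    return math.ceil(required_throughput_mb_s(data_tb, window_hours) / per_server_mb_s)


# Illustrative only: 20 TB in an 8-hour window, 200 MB/s per media server.
throughput = required_throughput_mb_s(20, 8)   # roughly 694 MB/s
servers = media_servers_needed(20, 8, 200)
```

With these assumed numbers, three media servers would fall short of the window and a fourth is needed, before allowing any headroom for growth or failed jobs.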
Once the implementation is over, it's essential to have constant monitoring mechanisms. "None of the software that you have will work seamlessly and automatically. There will be times when the backups fail, and people don't even notice it for days. And when you really need it, you never have it. So the two really crucial things during implementation and post implementation are to regularly monitor and test the backups," says Sangoi. According to him, the best possible way to ensure this is to conduct tests at defined frequencies along with the use of checklists. Some tests will involve plain sample testing to check the tape. Others may include a full restore and have a longer frequency. For example, a sample check can be done on a monthly basis, whereas a full server restore can be conducted on a quarterly basis.
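A test checklist of this kind is easy to track in a small script. The Python sketch below is one possible approach, assuming a monthly sample check and a quarterly full restore as in the example above; the test names, the 30-day and 90-day intervals and the `overdue_tests` function are illustrative assumptions.

```python
from datetime import date, timedelta

# Assumed test frequencies: monthly sample check, quarterly full restore.
TEST_INTERVALS = {
    "sample_check": timedelta(days=30),
    "full_restore": timedelta(days=90),
}


def overdue_tests(last_run: dict, today: date) -> list:
    """Return the names of restore tests that have never been run,
    or whose last run is older than the allowed interval."""
    overdue = []
    for name, interval in TEST_INTERVALS.items():
        last = last_run.get(name)
        if last is None or today - last > interval:
            overdue.append(name)
    return sorted(overdue)


history = {"sample_check": date(2024, 1, 1), "full_restore": date(2024, 1, 1)}
due_now = overdue_tests(history, today=date(2024, 2, 15))
```

Running a check like this daily and sending the result to the backup administrator is one way to avoid discovering a missed test only when a restore is needed.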
Last but not least, ensure that you use the advanced functionality provided by the backup suite. "Very few customers take advantage of database- and application-specific agents for making consistent backups. Similar is the case with virtual full backups/synthetic backups for reducing backup windows and snapshot-assisted backups for, in some cases, completely eliminating backup windows," says Balaouras.