Essential Guide

Building a disaster recovery architecture with cloud and colocation


Should I automate critical application failover on nodes?

Is it a best practice to automate the startup process after critical application failover from one node to another?

When nodes fail in the data center, applications need to restart as quickly as possible.

IT organizations implement a system to fail over from one node to another to allow rapid recovery of service. Manual intervention to bring an application back up slows this process down -- particularly if the node fails in the middle of the night or on a holiday.

Most critical apps are implemented as daemons or services -- they start automatically when the computer boots up. In this case, failover only needs to start the virtual machine where the application is installed, and the application comes up with the OS. Virtualization makes this failover method available to any application that runs inside a VM.
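Starting the VM is not quite the same as knowing the service is answering again, so a failover script may want to confirm it. The following is a minimal sketch, assuming the application listens on a TCP port; the function name `wait_for_service` and its timeout values are illustrative and not part of any particular failover product.

```python
import socket
import time


def wait_for_service(host, port, timeout_s=120.0, interval_s=2.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused or timed out: the service is not up yet, so wait and retry.
            time.sleep(interval_s)
    return False
```

A successful connection only shows that the port is open. An application-level check, such as a test query, would be a stronger signal where one is available.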

Sometimes applications need more than just an OS restart. Applications that weren't written as services may need a user to log on to the VM and get the app back up. This is usually only a problem on Windows servers. The logon and launch steps are fairly easy to set up with auto-logon and startup applications, but some applications also need the user to click buttons or open menus before the app can run again.

Automated application failover is also possible in this scenario. I use AutoIt scripts to automate application launch after failover. Scripts are good, but this type of automation is fragile: each version upgrade of the application might break the script.
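For applications that only need to be launched, not clicked through, the launch step can be sketched in a few lines. This is an illustration in Python rather than an AutoIt script, and it does not drive any GUI; the function name `launch_and_verify`, the grace period and the retry count are all assumptions chosen for the example.

```python
import subprocess
import time


def launch_and_verify(command, grace_s=5.0, attempts=3):
    """Start command; return the Popen handle if it survives grace_s, else None."""
    for _ in range(attempts):
        proc = subprocess.Popen(command)
        time.sleep(grace_s)
        if proc.poll() is None:
            # Still running after the grace period, so treat the launch as good.
            return proc
        # The process exited early; loop around and try again.
    return None
```

Checking that the process is still alive after a short wait catches the common case where the app starts and immediately exits. It says nothing about whether the app is actually serving users, which is part of why this kind of automation stays fragile.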

The biggest problem is with applications that don't like to fail. Applications that require a shutdown process, and cannot recover from an unplanned shutdown, are hard to fail over. Generally, these apps require further manual intervention, like listing and removing each database lock. It can be simpler to automate the alerts that get someone to fix these applications than it is to automate the fix itself.
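The alert-instead-of-fix approach can be sketched as a small wrapper around a health check. This is a minimal example under stated assumptions: `is_healthy` and `notify` are hypothetical callables that you would supply (for instance, a port check and a function that pages or emails the on-call admin), and `check_and_alert` is a name invented here.

```python
def check_and_alert(app_name, is_healthy, notify):
    """Run the health probe; call notify with a message if it fails or raises."""
    try:
        healthy = is_healthy()
        detail = "health check returned False"
    except Exception as exc:
        # A probe that crashes is treated the same as an unhealthy application.
        healthy = False
        detail = f"health check raised {exc!r}"
    if not healthy:
        notify(f"{app_name} did not recover after failover: {detail}")
    return healthy
```

Keeping the notifier as a parameter means the same check can feed whatever alerting channel is already in place, and the manual fix, such as clearing database locks, stays with a person.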

About the author:
Alastair Cooke is a freelance trainer, consultant and blogger specializing in server and desktop virtualization. Known in Australia and New Zealand for the APAC virtualization podcast and regional community events, Cooke was awarded VMware's vExpert status for his 2010 efforts.


This was last published in May 2015
