Mainframers need to focus on process not piecemeal disaster recovery

Recent disasters have forced IT managers to consider new aspects of recovery and business resilience, which is one of SHARE's focuses for '07. Some of these questions are discussed here.

Last time I talked about IT 2010, one of the focuses that SHARE has built sessions around for the upcoming meeting. The other focus is business resilience.

The IT environment and business resilience

When you talk about business resilience, you're talking about keeping your business functioning when you've lost a major component. But it's much more than having a disaster recovery site with a bunch of computers where you can replicate your data.

The mainframe is extremely reliable and has a number of features that are specifically designed to allow for replication of your system at remote sites. Non-mainframes have some of these features. People tend to focus on hardware and software, the concrete facets of business resilience that are easy in the sense that they're finite. But the business processes must be considered too, as must the people and how they will get to where they need to be in an emergency.

Business resilience is much more than just the technology, which is one of the points that we're trying to make at the SHARE conference.

Tough IT lessons: People matter

There are all kinds of lessons that we've learned from 9/11 and Katrina. The factors most often ignored are the people and the processes. What do you do when your people are concerned about their houses being flooded and they're not even coming in to work? You can move your process somewhere else, but if you don't have the people and you don't have the processes in place to work in that environment, you're going to have a lot of problems.

People were a huge lesson of 9/11, obviously. A lot of the financial institutions had replicated sites and all of that, but what they were missing was a response to a situation where people can't get to the sites. How do you continue operating? Of course, these are unpleasant things to think about, but they have to be taken into account in terms of what you have to do to ensure business resilience.

New IT threats: Pandemic?

People are recognizing that disasters come in different ways. We used to be concerned about things like our building burning down. Now we're recognizing that there could be a natural disaster, a terrorist attack, or even something like pandemic flu. The latter poses a particularly interesting question: How do you keep your business running if 40% of your people are out?

Imagine a scenario where you haven't lost your facility and don't have to go to the backup site, but you've lost the people who keep things running. One solution is to have people work from home, but that assumes that your telecommunication infrastructure is going to be okay and that there'll be enough bandwidth. What about when the telecommunications people get sick and are also out? So it's a question of how much you should encompass in your planning.

Traditional preparation may not be enough because you never know what situations might emerge as potential threats. Furthermore, you need to think about all the other dependencies that are taken for granted. We assume that plugging into the power grid will give us electricity.

But what happens when the power plant personnel are hit with the flu, too? What happens when your remote staff can't get network access because the people at the communications company are out? Now you have to start thinking about more than just the basics in your business. Do you have to start putting things in contracts? You have to consider that your business does not function in isolation and that there are dependencies that cannot be controlled on your end.

I remember years ago when we were moving into a new facility, we were trying to make sure that we had redundant power. We decided to get two separate feeders from the power company. That's a good idea, but unfortunately those feeders came from the same substation. So, if someone knocked out one of the power lines between the facility and the substation there would still be power from the other line. But what if the substation gets hit? We learned to work our way back up the supply chain to take these kinds of things into account.
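Working back up the supply chain like this can be sketched in a few lines of code. What follows is only an illustration, not a tool we used: the node names and the `FEEDS_FROM` map are made up to mirror the two-feeder story, and the sketch assumes the supply graph has no loops.

```python
# Hypothetical supply map: each node lists the upstream nodes it draws from.
# The names mirror the two-feeder example and are assumptions for illustration.
FEEDS_FROM = {
    "facility": ["feeder_a", "feeder_b"],
    "feeder_a": ["substation_1"],
    "feeder_b": ["substation_1"],
    "substation_1": ["power_plant"],
    "power_plant": [],
}


def has_supply(node, failed, graph):
    """Return True if node still has a working path back to a source.

    A node with no upstream entries is treated as a source.
    Assumes the graph is acyclic.
    """
    if node in failed:
        return False
    upstream = graph[node]
    if not upstream:
        return True
    return any(has_supply(u, failed, graph) for u in upstream)


def single_points_of_failure(target, graph):
    """List every node whose loss alone cuts the target off."""
    return sorted(
        node
        for node in graph
        if node != target and not has_supply(target, {node}, graph)
    )


print(single_points_of_failure("facility", FEEDS_FROM))
```

Neither feeder shows up in the result, because losing one still leaves the other. The shared substation does, which is exactly the weakness that two feeders from the same substation hide.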

No simple solution to business resilience

Obviously, there isn't going to be a cookbook where you go to the session and people give you a list of things to do and you're done. There is so much to consider, which is why this is one of the themes for the year. The environment keeps changing. The nature of the disasters keeps changing and what you want to accomplish keeps changing. So your planning has to change too.

At the SHARE meeting, you will find some people talking about hardware considerations, others about personnel issues, and others about infrastructure such as power and phone lines. The sessions look at the things you can control and the things you can't, and at the probability of something happening in each of these areas. You can put dollar values on these things and eventually figure out whether a given safeguard is worth investing in.
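One common way to put dollar values on these things is a simple expected-loss comparison: multiply the yearly probability of an event by what it would cost you, then see how much of that a safeguard removes. The sketch below is a minimal illustration under that assumption. The function names and every figure in it are hypothetical, not numbers from any SHARE session.

```python
def expected_annual_loss(annual_probability, impact_dollars):
    """Expected yearly loss: chance of the event times its cost."""
    return annual_probability * impact_dollars


def worth_investing(p_before, p_after, impact_dollars, annual_cost):
    """True if the safeguard removes more expected loss than it costs per year."""
    saved = (
        expected_annual_loss(p_before, impact_dollars)
        - expected_annual_loss(p_after, impact_dollars)
    )
    return saved > annual_cost


# Made-up figures: a $2,000,000 outage whose yearly chance drops
# from 5% to 1% with a safeguard that costs $50,000 a year.
print(worth_investing(0.05, 0.01, 2_000_000, 50_000))
```

With these invented figures the safeguard removes about $80,000 of expected loss a year for $50,000, so it pays for itself. If the outage would cost only $500,000, the same safeguard removes about $20,000 and does not.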

These are the kinds of questions that we're starting to build sessions on. At previous SHARE events, the sessions have focused on how to replicate systems for a backup site, i.e., classic disaster recovery. Now we are looking beyond the hardware. Conference attendees should be able to go back to their companies and share some of the things they've learned to make sure that they stay in business if a disaster strikes, which is a great return on investment.

Just some of the sessions addressing business resilience include:

  • Business Resiliency Basics
  • Building Better Resiliency through Data Replication
  • Multi Site Business Continuity Architectures
  • IT Service Continuity: Do You Still Need a Disaster Recovery Plan?
  • IT Service Continuity - It Is More than Keeping Systems Going
  • Planning for IT Business Continuity in a Heterogeneous World
  • Disaster Recovery, Backup & Restore for z/VM & Linux

One of the real benefits of SHARE is that you not only get all these sessions, but you also get to talk to the presenters offline, share experiences with what you're doing and ask questions. The range of knowledge at a SHARE meeting is astounding. I once said that you can go to a SHARE meeting and ask whatever esoteric question you'd like, and someone there could probably answer it. There is just a tremendously broad range of people with different expertise and experiences who are willing to share their knowledge. And that's an invaluable resource.

So remember, business resilience is an issue facing all IT pros today. Attending SHARE is a good way to jump-start your knowledge on this subject. Hope to see you there.

ABOUT THE AUTHOR: Robert Rosen is the immediate past president of SHARE Inc. Currently, he serves as the CIO at the National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health, US Department of Health and Human Services.
