The IT ops teams of today are flexible and efficient, using technology and management to run their businesses. What’s their secret?
For better or worse, today’s IT service operations team is different from the one you joined five, 10 or 20 years ago, with fewer people and resources managing an ever-increasing number of systems.
The most obvious culprit is the economy. In 2009 and 2010, IT budgets fell sharply. According to Gartner, they shrank 8.1% in 2009, and another 1.1% the year after. And while IT budgets started growing again in 2011, they are only at the level they were in 2005.
At the same time, fewer IT staffers are managing more systems. Gone are the days of the 25:1 system-to-system-administrator ratio; today that number is closer to 250:1.
“This is the new normal,” said Mike Sargent, general manager for enterprise management at CA, the management software vendor. “The effective demands on IT are going up exponentially, and it is under massive pressure to keep costs under control.”
But most IT teams have responded with ingenuity in keeping the lights on and the hard drives spinning. How are successful IT operations teams keeping afloat?
Less Is More
On the systems and facilities side of things, IT teams have been consolidating with abandon. Virtualization has famously been the primary way for IT departments to keep up with demand while keeping a lid on costs. Likewise, large companies consolidated multiple data centers into fewer centralized ones, slashing costs while also establishing consistent standards and better availability across far-flung geographies.
All that consolidation has had serious implications for IT operations staffs. One manufacturer went from having IT staffers in all its regional facilities to a shared service center model. Festo, a global industrial automation and pneumatics manufacturer with 15,500 employees in 15 countries, eliminated most local IT ops people in the move and regrouped them in one of three company data centers.
“We operated locally for a long time, but then we worked to regionalize and centralize IT operations,” said Steve Damadeo, Festo IT operations manager for the Americas.
For example, Festo used to run mail services in 60 countries; now mail is centralized in its European, North American and Asian data centers. Festo plans to further consolidate services like file and print into these regional data centers as well. “There’s no point in having 10 systems when I can do it with two,” he said.
Enterprise IT has also learned to streamline common tasks so they can be handled by IT operations teams in less expensive parts of the world.
Mark Szynaka is a cloud architect at Network Strategies Inc., an IT consultancy in New York City, and has worked at several large Wall Street firms. There, his work consisted largely of automating network management processes so they could be handled remotely.
The goal, Szynaka said, was to use ITIL (IT Infrastructure Library) best practices to take a rules-based approach to discovering errors and correlating events, and to interface with in-house ticketing systems staffed by company help desk workers sitting far from headquarters.
“Most of the day-to-day care and feeding is going to Texas and India,” Szynaka said—especially tedious tasks. Under this system, only those problems that cannot be easily reconciled “bubble up” to senior, in-house IT staffers.
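To make the rules-based, tiered approach concrete, here is a minimal Python sketch. It is not Szynaka's system: the `Event` fields, the `ROUTINE_KINDS` rule table, the five-minute correlation window and the queue names are all assumptions chosen for illustration. It groups events from the same host that arrive close together into one incident, then sends an incident to the remote help desk only if every event in it matches a known routine rule; anything else bubbles up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    host: str
    kind: str       # e.g. "link_down", "disk_full" (hypothetical event kinds)
    timestamp: int  # seconds since an arbitrary epoch


# Hypothetical rule table: event kinds the remote help desk can resolve itself.
ROUTINE_KINDS = {"link_down", "disk_full"}


def correlate(events, window=300):
    """Group events per host; a gap longer than `window` seconds starts a new incident."""
    by_host = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        by_host[event.host].append(event)

    incidents = []
    for host_events in by_host.values():
        current = [host_events[0]]
        for event in host_events[1:]:
            if event.timestamp - current[-1].timestamp <= window:
                current.append(event)
            else:
                incidents.append(current)
                current = [event]
        incidents.append(current)
    return incidents


def route(incident):
    """Routine incidents go to the help desk; everything else is escalated."""
    if all(event.kind in ROUTINE_KINDS for event in incident):
        return "help_desk"
    return "senior_staff"
```

In this sketch, two `link_down` events from one switch a minute apart become a single help desk ticket, while an unrecognized fault on the same switch an hour later is escalated on its own.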
Indeed, tiering IT service operations tasks and teams has become increasingly common. The California health care giant Kaiser Permanente split its IT operations team in two a couple of years ago. The first is an offshore team of Kaiser IT employees focused on “red” and “green” systems availability using the IBM Tivoli Netcool/Omnibus operations management console; the second is a critical application support team in the U.S. that proactively addresses performance problems using vCenter Operations, VMware’s performance analytics suite.
Creating the two teams was “a natural consequence” of implementing performance analytics, said Ian Dodd, Kaiser Permanente’s director for service delivery management.
“Availability monitoring is by definition reactive. A ticket comes in, they react,” Dodd said. Not so with possible performance problems. “You can’t take a ticketing approach to a problem that may happen in a couple of hours; you have to give these guys time.”
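The difference Dodd describes can be sketched in a few lines of Python. This is an illustration only and says nothing about how Netcool/Omnibus or vCenter Operations work internally; the function names, the HTTP-style status check and the straight-line trend model are all assumptions. The first function is a reactive red/green check. The second fits a least-squares line to recent readings of a metric and estimates how many hours remain before it crosses a threshold, which is the kind of lead time a performance team needs.

```python
def is_available(status_code):
    """Reactive red/green check: the system is either up or it is not."""
    return status_code == 200


def hours_until_threshold(samples, threshold):
    """Estimate hours until a metric reaches `threshold`.

    `samples` is a list of (hour, value) pairs in time order. Returns 0.0 if the
    latest value is already at or above the threshold, and None if there are too
    few samples or the fitted trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        return None

    latest_t, latest_v = samples[-1]
    if latest_v >= threshold:
        return 0.0

    # Ordinary least-squares slope and intercept.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / denom
    if slope <= 0:
        return None
    intercept = mean_v - slope * mean_t

    crossing_hour = (threshold - intercept) / slope
    return max(crossing_hour - latest_t, 0.0)
```

With readings of 50, 60 and 70 percent at hours 0, 1 and 2 and a threshold of 90 percent, the estimate is two hours: no ticket would exist yet, but the trend is already visible.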
But there are costs to distributing IT operations staffers around the globe. While this global village ethos makes it easier to hire for specific technical skills or language skills (to say nothing of staffing the third shift), many IT pros miss the old days.
“There are days when I think that would be much easier if everyone were in the same room,” said Festo’s Damadeo. He laments the constant video conferences, and struggles to tone down his fast-talking New Yorker style. And it’s tough to ensure that everyone is on the same page. “I spend a lot of time on communication,” he said.
Automate All Things
Ultimately, consolidation and centralization are a game of diminishing returns. Many organizations have hit a ceiling on how many systems they can virtualize or how many virtual machines (VMs) they can stuff on a server. Likewise, latency and bandwidth limitations curtail global organizations’ ability to centralize processing into a single data center facility, and staffs simply can’t get any smaller and still run effectively.
With these tactics maxed out, where does IT turn for greater efficiency?
In a word: automation. Like workers in countless industries before them, IT operations pros are hard at work automating time-consuming and error-prone workflows in search of efficiency.
Cypress Semiconductor Corp. in San Jose, Calif., has IT service operations distributed around the world and, over the past couple of years, has worked to automate onerous IT workflows such as creating and deleting employee email accounts. Before automation, the process took a week or more, said Venki Sundaresan, senior IT director; it can now be completed in about one day, including obtaining all the necessary approvals.
Cypress uses the OpsOne IT process automation platform, a Software as a Service (SaaS) offering from Appnomic. The fact that OpsOne is delivered as SaaS makes it low maintenance for the IT team, and automating complex processes has freed the operations team for more strategic, value-added work, said Sundaresan.
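A workflow like account creation with approvals is a natural fit for automation because it reduces to a small state machine. The Python sketch below is hypothetical and does not represent OpsOne or Cypress's actual process; the `AccountRequest` class, its fields and the approver names are invented for illustration. A request records each sign-off and marks the account as provisioned the moment every required approver has responded, with no one needing to chase paperwork.

```python
from dataclasses import dataclass, field


@dataclass
class AccountRequest:
    employee: str
    required_approvers: frozenset
    approvals: set = field(default_factory=set)
    provisioned: bool = False

    def approve(self, approver):
        """Record one sign-off; return True once the account is provisioned."""
        if approver not in self.required_approvers:
            raise ValueError(f"{approver} is not an approver for this request")
        self.approvals.add(approver)
        return self._try_provision()

    def _try_provision(self):
        # Provision exactly once, as soon as every required approver has signed off.
        # A real system would call the mail platform here; this sketch only flips a flag.
        if not self.provisioned and self.approvals >= self.required_approvers:
            self.provisioned = True
        return self.provisioned
```

For example, a request that requires sign-off from a manager and a security reviewer stays pending after the first approval and is provisioned immediately on the second.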
But many IT operations pros are resistant to automation, fearing that it will put them out of a job. That’s a fallacy, said John Allspaw, senior vice president of technical operations at Etsy.com, an online craft marketplace. “By that measure, Google would have 50 people working for them,” Allspaw said.
In fact, Allspaw has found the opposite to be true at Etsy: “The more automation we have, the more people I hire,” he said, as more efficient workflows open up the doors for IT staffers to pursue new projects.
Further, you can’t just automate a process and send staff on their merry way. “When things go wrong with automation, which they absolutely and always will, we need to have the people that wrote the automation to introspect it and to help repair it,” Allspaw said.
In particular, configuration management and provisioning have recently benefitted from automation advances, Allspaw said. Etsy, for example, makes heavy use of Opscode Chef and Cobbler, a Linux installation server.
Still, automation is no silver bullet, Allspaw conceded. “When automation is your hammer, then every problem looks like a nail,” he said, and not every problem needs automation as its solution. “The question is: Should we automate it, how and where and why should we automate it?”
And when all else fails, you can always outsource the IT operations grunt work.
Take network monitoring. “It’s a big commitment. A lot of firms don’t have the manpower or the expertise and aren’t interested in hiring [them],” said Craig Schotke, manager for the monitoring services team at YJT Solutions, a tech consulting firm in Chicago that offers network monitoring services using CA Nimsoft Monitor.
In recent years, YJT has seen a surge of interest in outsourced network monitoring, initially from small emerging companies, and more recently from established midsized companies. Often, these companies keep network monitoring in-house but choose to outsource its management.
“The trend we’re seeing is of companies reducing the size of their IT staff,” Schotke said. He believes that in the next few years, the majority of companies will have outsourced their email, and “that trend is going to continue with other services.”
In organizations born in the cloud, IT operations staffs are already practically nonexistent, said cloud architect Szynaka of Network Strategies.
Szynaka counts several companies in the media and entertainment industry among his customers. Their internal IT staff consists largely of data managers and programmers, whom he calls “the heart of the strategic differentiation for their business.” For infrastructure, they use Amazon Web Services extensively, combining turnkey Amazon Machine Images (AMIs) and services such as content delivery networks (CDNs), databases and Hadoop into virtual private clouds created by Szynaka and his cohorts.
“We build [our customers] a virtual private cloud, then give them the keys and lock ourselves out,” Szynaka said. “They don’t need an operations team—they have me.”
This article originally appeared in the December/January issue of Modern Infrastructure.