IT operations has a PR problem, unlike the DevOps movement. Could the solution be as simple as embedding operations professionals within product development teams, and publicly tooting its own horn? An Etsy.com senior VP thinks that’s part of the way forward.
A leading light of the DevOps movement, John Allspaw is the author of books on Web operations and capacity planning, and a veteran of some of the world’s best-known Web properties. He’s currently senior vice president of technical operations at Etsy.com, the Amazon.com of crafts. We asked Allspaw to compare IT operations pre- and post-DevOps, and to offer suggestions about simple things that IT operations teams can do to boost their profile.
How do the development and operations teams work at Etsy?
John Allspaw: At Etsy, development is split along functional boundaries like mobile, payments, fraud or search, and each team gets a “designated ops engineer” who goes to their weekly meetings, knows what projects they’re working on, and knows if there’s new capacity that needs churning out. He or she is also responsible for training everybody else in production operations about what’s going on. Because the designated ops person is involved early on in the development process, we have the alerts and the infrastructure set up before the code is even written, and operations has a much larger role in design.
Do you think that DevOps model can work in non-Web, non-product development-oriented organizations?
Allspaw: Do I think that this model could work elsewhere? Absolutely. Do I think that it will? I would still say probably not, because culturally, IT is still very much viewed as a cost center rather than an enabler, the same way that the people who take out the trash are viewed as a cost center.
What can IT do to change that?
Allspaw: IT is largely regarded as a black box. One way we’ve addressed that is to extend the monitoring that we use for the website and apply it to corporate IT. Take wireless. Turns out that wireless networks are reasonably complex. The expectation is that it works most of the time, but there’s no real sense that what you’re doing may contribute to it not working. So we publish a graph with the signal strength and population of who is connected to what access point at any given moment on a dashboard for everybody to see, including the receptionist. That way, if there’s something weird with her wireless, she can’t just throw up her hands and say, “Ah, this stupid wireless!” She’s enabled to see that maybe the wireless access point is overloaded, and it opens up that black box which is IT.
Another thing is to find cheap low-hanging-fruit ways to make the non-technical parts of the organization more efficient, and publicly celebrate them. We magically never run out of paper in any of the printers at Etsy. Why? Because we have a graph for how much paper is in every printer, and we have alerts on it, just like we do for servers. Every time I tell people about that, even at companies like Google, they say, “Wow, that’s a really good idea. No kidding.”
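Editor's sketch: the printer example boils down to a threshold alert on a metric, the same pattern used for servers. The snippet below is a minimal illustration of that idea only, not Etsy's tooling; the `PrinterReading` type, the `low_paper_alerts` function, the printer names and the 100-sheet threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PrinterReading:
    """One sample of how much paper a printer has left (hypothetical shape)."""
    printer: str
    sheets_remaining: int


def low_paper_alerts(readings, threshold=100):
    """Return one alert message per printer at or below the threshold."""
    return [
        f"LOW PAPER: {r.printer} has {r.sheets_remaining} sheets left"
        for r in readings
        if r.sheets_remaining <= threshold
    ]


# Example readings; a real setup would collect these from the printers.
readings = [
    PrinterReading("3rd-floor-east", 420),
    PrinterReading("lobby", 60),
]

for message in low_paper_alerts(readings):
    print(message)
```

In practice the readings would feed the same graphing and alerting system that already watches the servers, which is the point Allspaw is making.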
Before you started working in DevOps shops, what was it like working in IT operations?
Allspaw: In traditional enterprises, IT operations had responsibility for availability and to some extent performance. But only senior management had any explicit knowledge of what development was working on. Usually, development would write a bunch of things on their own, and when they were done, the reason it couldn’t go to production was that ops hadn’t ordered servers or put in alerts or something. That’s unfortunate because operations was seen as a blocker, feeding into the stereotype that if operations had their %^&* together, development would be able to move faster.
Do you use public cloud?
Allspaw: Sure, we try to exploit services that we don’t feel as though we need to take on ourselves. We use Google Apps, PagerDuty and a number of hosted B2B software-as-a-service offerings for things like fraud detection and payment processing. We exploit Amazon Web Services where it makes sense: for long-tail photo storage we often use S3, but for hot images that require low latency, we have a hierarchy of caches back in our data center.
But organizations that think they can outsource everything are going to be disillusioned. We’ve got a term in my field: “You own your own availability.” Etsy may use Gmail for corporate mail, but we’re under no illusion that that relieves us of an entire category of things to care about. It is just another choice.
-- Alex Barrett