What if Everything We’ve Been Doing is Wrong?

60-wrong-way

After I wrote my last post, I was talking with Donnie Berkholz as we traveled to FOSDEM. Donnie commented on how powerful of a post it was, yet it left the reader hanging. He, and other readers, wanted more. So I’ve taken the liberty of breaking down more of the reasons Enterprise IT needs a “special kind of DevOps” as posted by Andi Mann. I don’t want anyone to think I am picking on Andi personally. Rather, his post reminds me of all the excuses Enterprises give as to why “We can’t change”. As Mick told Rocky, “There ain’t no can’ts!”

  • They cannot achieve the same levels of agility and personal responsibility as a smaller or less complex organization.

Why Not? Principles that teach agility and speed have long been used at large companies such as Microsoft. (Yes, feel free to say Microsoft is a bad example, they are still one of the world’s largest software companies.) Additionally, if one doesn’t want to take personal responsibility for what they produce for a company, maybe they are in the wrong job for the wrong company?

  • They cannot stream new code into production and just shut down for a couple of hours to fallback if it fails.

This is fool-hardy to begin with. The goal of methods such as Continuous Integration is to be constantly building releases and testing them to catch problems before they are released to production. Also, the idea is to test small changes, so you know exactly what breaks, rather than large chunks of code. Large enterprises “cannot stream new code” because they haven’t built the necessary flows in front of production releases to effectively and efficiently test and verify code changes. This requires IT organizations to fully automate their processes all the way down to server builds, a process they often are incapable of doing because of an attachment to the “old way of doing things”.

  • They rarely ever have ‘two pizza teams’ for development or operations (indeed, they are lucky if they have ‘two Pizza Hut teams’).

The size of the team is nearly always irrelevant. Within each Pizza Hut there are tables, and each table consumes the pizza buffet. The goal of DevOps is to increase the flow of the work through those tables so the teams can eat their pizza and leave quicker. As I’ve said before, focusing on the Silos is the wrong way to solve the problem. Rather focus on the grain elevators that move the grain to produce something meaningful.

  • They cannot sign up for cloud services with a credit card without exceeding their monthly limit and/or being fired.

Get an MSA/PO with the cloud vendor or build a Private Cloud. Cloud or no cloud, building strong automation on top of existing VM or server infrastructure can help alleviate many problems in service delivery.

  • They cannot allow developers to access raw production data, let alone copy it to their laptop for development or testing.

Scrub the data. DevOps or not, this is a problem that we’ve solved years ago. When I worked at a major e-commerce site, real data was often required for testing, but that data was always cleaned of any sensitive PII. This is not an issue that is unique to DevOps.

  • They cannot choose to stream new code into production in violation of a change freeze, or even without the prior approval of a CAB.

Once again, one assumes that DevOps is all about willy nilly pushing of code to production. One aspect of DevOps is about increasing the flow of the work through the system by optimizing the centers where value is added. As I’ve discussed before, principles and practices of DevOps actually help things like Change Control.

  • They cannot just tell developers to carry pagers ‘until their software is bedded in’ (not least because their developers have always carried pagers, and on a full-time basis).

If Devs already carry pagers, then they’ve already been told to carry pagers, hence, “they” can indeed tell their Devs to carry pagers. Additionally, bedding in of the software should happen in the lower environments as discussed previously. If you’ve done things right before production, pagers become a tool that is used when things go really badly. It’s a form of monitoring and incident response that becomes meaningful again because you aren’t being paged for endless break fix work.

  • They cannot put developers and operators together because one team works 24×7 shifts in 7data centers while the other works 16-hour days in 12 different locations.

Well, good, they at least have 16 hours a day together. Highly distributed remote teams are becoming more and more common. Technology is evolving to help bring this concept of remote work and people are finding creative ways to work around it. I’m also against the idea that DevOps is all about merging dev and ops onto one team, because that is not the point. The idea, as already stated, is to increase the flow of work between Dev and Ops and build a culture of continuous improvement between the two groups (three groups if you include the business). Dev, Ops, Business, who gives a shit. The point is working towards a common goal, no matter where you sit.

What large IT shops cannot do is be satisfied anymore with the status-quo. They cannot accept the ways of the past any longer, and they have to start thinking about blowing up their way of doing things. They cannot let the castles and fiefdoms of the past get in the way any longer.

I think the single most powerful question any IT shop can ask themselves is, “What if everything we’ve been doing over the last X years is completely wrong?” Start there, and reevaluate everything you’ve been doing to achieve (or not achieve) the results your customers require.

2 thoughts on “What if Everything We’ve Been Doing is Wrong?

  1. Pingback: DevOps: What if everything we’ve been doing is wrong? | Chef Blog

  2. Michael I think you’re absolutely right to debunk excuses for Enterprise not to adopt practices proven to increase productivity and improve quality and increase availability.

    In my experience enterprise teams do manage to adopt continuous integration practices. And if issues are identified, as you note, the cord can be pulled, and people can resolve issues. Further, identifying issue earlier on saves on costs, but also enables teams to keep their promises to external customers. In one example, the customer didn’t see the value in running content management server clusters in some of the lower environments. When the application was deployed to Production, I’m sure you know what happened. It was resolved, but they had to provision another environment that was like production so the clusters could be tested and the issue resolved.

    In my experience many of the shortcuts taken in the lower environments (not mimicking the Production topology for example) are driven by a lack of automated infrastructure, and by a false sense that there might be cost savings by running a Dev environment that doesn’t have key capabilities like clustering. This is a problem in the way many enterprises build their infrastructure in the datacenter. Automation makes it pretty simple to provision and deprovision environments.

    I don’t think the data needs to be identical in all environments. Invariably there will be differences in Production. However, again, using smart automation, and something like a Blue / Green failover minimizes issues.

    I don’t know that DevOps is the best word for what people talk about. It’s certainly not an enterprise friendly word. And I’ve heard it used in the enterprise to describe something very different from what I think DevOps means. It may be too late to rebrand it, but with enough automation, each Silo does not need to get its hand into the process and slow it down–decisions can be made in advance and can be enforced through automation more tranparently, faster, and more consistently than by individuals.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s