Get Your Head Out of Your aaS

I’ve been floating between the worlds of Cloud and DevOps for a while now, and it is interesting to see the Cloud world finally start to realize that the real value is in DevOps. It’s great that more people are starting to pay attention to this cultural and professional movement. What is not great is how Cloud experts tend to get wrapped up in debates that are trivial and meaningless in the larger scheme of things. Take, for instance, two persistent debates I keep seeing: IaaS vs. PaaS, and which PaaS is better. I hate to be the one to break it to these camps, but it doesn’t matter; at the end of the day you are selling plumbing fixtures that crap flows through.

To understand what I mean, let’s take a step back. In 2008, I started pursuing my MBA at The Ohio State University. One of the core requirements of the degree was Operations Management, where you learn manufacturing optimization through ideas such as Lean and Six Sigma. The book “Learning to See” was part of the course material, and it focused on optimizing manufacturing processes through visualization, also known as Value Stream Mapping. As the course progressed, I had a personal epiphany. As we kept walking through manufacturing processes and Value Streams, I quickly realized that the work we did in IT was all about manufacturing a good or service that someone would consume. Automation in the IT world is about (or should be about) optimizing these Value Streams and (hopefully) eliminating waste from the system. My Operations Management course really taught me to see (pun intended) and to think differently about how we worked in IT.

I took this newfound knowledge back to my work, where it was summarily ignored by my boss and coworkers. Lacking support, I shelved my ideas. Little did I know that many of the Lean principles I had learned would be at the forefront of how IT is changing today, and that they were already changing it back in 2008. I just didn’t know it.

When somebody asks me what DevOps is, I often respond with the simple idea that “DevOps is about increasing the flow of work through IT.” I borrow this idea heavily from “The Phoenix Project”, but I find it is the simplest way to capture the essence of this cultural and professional movement. And that is where Value Stream Mapping and the ideas of Lean come into the conversation. Books like “The Phoenix Project”, and notable DevOps contributors such as John Willis, expound on the value of these techniques for optimizing the IT manufacturing chain, be it Development work or Operations work.

Value Stream Maps are relatively simple. They identify the flow of a raw material through various processes that add value. They also identify areas of waste in the system, and they help in building the Future State Map, or the Value Stream that you want to achieve in the future after optimizing the system. The most basic and valuable thing about Value Stream Maps is how they allow you to easily visualize your work, and once it is visualized it is easy to understand and optimize.

If you look at the first current state map, you can easily see how relabeling the boxes to reflect common IT tasks, say in a server build process, makes this a powerful tool for IT. Replace the box names with another process – maybe code build, testing, and release – and you see once again how Value Stream Mapping is a key tool in fixing our broken IT.

Now that we’ve established a method for the optimization of our IT processes, let’s go back to thinking about Cloud and the debates around IaaS, PaaS, and the PaaS vendors. Take the second Value Stream Map. Say this diagram more accurately reflected server builds, and the time it took to install an OS was one hour. We optimize this process through our IaaS-based Cloud, public or private, and get the time down to 5 minutes. That is awesome: we’ve saved 55 minutes and really optimized that process. Go team!

If “premature optimization is the root of all evil”, then local optimization is the Devil’s half brother. In the above example we saved 55 minutes, but the total time of work flowing through the system is still 67 days, 23 hours. And that is where we come back to Cloud. IaaS is a local optimization. It is great, it is awesome, but it is a very small piece of the puzzle. PaaS is another local optimization, but instead of optimizing one process it optimizes three or four. Which is great, but many IT organizations are going to “adopt Cloud” for business agility and speed, then be sadly surprised when their local optimization does little to fix their screwed up system. Cloud can be a great enabler, but it is only a small piece of the larger system. It is high time more of us focus on the larger system.
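To put a number on how local that optimization is, here is a quick back-of-the-envelope sketch. It assumes the 67 days, 23 hours figure is the end-to-end lead time after the OS install was sped up; the function name is mine, purely for illustration.

```python
from datetime import timedelta


def improvement_share(saved: timedelta, lead_time_after: timedelta) -> float:
    """Return the saving as a percentage of the original end-to-end lead time."""
    # Assumption: the original lead time is what remains plus what was saved.
    lead_time_before = lead_time_after + saved
    return 100 * (saved / lead_time_before)


saved = timedelta(minutes=55)                    # OS install: 1 hour down to 5 minutes
lead_time_after = timedelta(days=67, hours=23)   # what the work still takes end to end

share = improvement_share(saved, lead_time_after)
print(f"Saved {share:.3f}% of the total lead time")
```

Under those assumptions the saving comes out to well under a tenth of one percent of the total. The step got twelve times faster and the customer still waits more than two months.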

You Build Kingdoms Because Your Mother Didn’t Love You

Destruction of silos is all the rage in DevOps and has been since the beginning of the movement. Patrick Debois wrote a very intelligent piece on why silos exist and how they came about as a management strategy. While the post explains why the hierarchical style of management came about in the US (General Motors and Sloan), it doesn’t cover some of the personal motivations behind why silos or management kingdoms come about.

Parkinson’s Law

Over the last several years Bike Shedding – or more appropriately Parkinson’s Law of Triviality – has become very popular in technology. But in all the trivial debate, it seems many technologists have missed C. Northcote Parkinson’s other law, aptly named Parkinson’s Law.

Parkinson’s Law simply states “…that work expands so as to fill the time available for its completion.” Any lifelong procrastinator will immediately know this is true, as will anyone who has attempted any level of project management. Further, Parkinson explains that this expansion of work also creates an expansion of the people doing the work. While Parkinson was focused on governmental organizations, Parkinson’s Law can also apply to other organizations.

Parkinson attributes this expansion of work to two factors:

Factor I.—An official wants to multiply subordinates, not rivals; and

Factor II.—Officials make work for each other.

The Law of Multiplication of Subordinates

Now as work expands, for whatever reason, Parkinson explains that the worker (worker A) must find a way to handle the workload, and has three options:

  1. Resign
  2. Split the work with colleague B
  3. Hire subordinates

And as Parkinson points out, any rational actor is going to choose #3, and in doing so will at a minimum hire 2 subordinates, workers C and D. Hiring one subordinate would be effectively equal to #2, splitting the work with C instead of B, and thus increasing the pool of competition (A, B, and C would effectively be at the same level at this point). Thus the rational choice is to hire 2 or more subordinates, and in doing so A can leverage C and D against one another, holding a possible promotion out as a carrot in order to keep C and D in check.

Of course, work expands, eventually C and D become too busy, and thus they must make the same choice that A had to when they were hired. Rational actors as they are, they choose option #3, each hire 2 subordinates (at least), and please welcome E, F, G, and H to the company. Worker A now has a beautiful fucking kingdom; self-loathing because of an unloving mother notwithstanding.
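The growth of the kingdom is easy to sketch. This is a toy model, assuming every worker at the bottom of the org chart hires exactly two subordinates in each round (Parkinson’s minimum); the function name and parameters are mine, not Parkinson’s.

```python
def headcount(rounds: int, hires_per_worker: int = 2) -> int:
    """Total staff after a number of hiring rounds, starting from worker A alone."""
    total = 1    # worker A
    bottom = 1   # workers on the lowest level, the ones who hire next
    for _ in range(rounds):
        bottom *= hires_per_worker   # every worker on the lowest level hires
        total += bottom
    return total


for rounds in range(3):
    print(rounds, headcount(rounds))
```

Zero rounds is A alone, one round adds C and D for a total of three, and two rounds adds E, F, G, and H for a total of seven. Every further round roughly doubles the kingdom again.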

The Law of Multiplication of Work

As Parkinson points out, seven people are now doing what one once did. Instead of simply expanding to fill time, work now begins to multiply. Why? Well, the workers begin to create busy work for each other. The example Parkinson gives goes as follows:

“An incoming document may well come before each of them in turn. Official E decides that it falls within the province of F, who places a draft reply before C, who amends it drastically before consulting D, who asks G to deal with it. But G goes on leave at this point, handing the file over to H, who drafts a minute, which is signed by D and returned to C, who revises his draft accordingly and lays the new version before A.”

Parkinson continues his example documenting the busy work that this kingdom produces, much of it useless, and leaves the example with A leaving the office for the day:

“Among the last to leave, A reflects, with bowed shoulders and a wry smile, that late hours, like grey hairs, are among the penalties of success.”

Success indeed my King, Success indeed.

Sound Familiar?

If, at this point, you haven’t seen the slightest reflection of an org you know, work at, or have worked for, then please let us all know the magical organizational utopia that employs (or has employed) you. Snark and rage aside, this really highlights the problem with many organizations: big kingdoms built to produce very little of value other than process and busy work. And that is why the DevOps Silo Rage gets so much airtime. Process and busy work are there to further the growth of the kingdom, not feed the soul of the individuals at the bottom.

This also highlights why DevOps focuses so heavily on borrowing from things like Lean Manufacturing. Lean emphasizes getting rid of unnecessary process and waste, in order to focus on value creation activities. It also empowers the individuals – E, F, G, and H in the example above – to shape how the value creation process should actually work and what processes are wasteful.

Now reflect on A. What if (s)he came in one day and E, F, G, and H wanted to revamp all the “meaningful” process that keeps them in check? What do most kings (or queens) do when a revolt happens? Kingdom in jeopardy, they squash the rebellion and execute its leaders. Sometimes the monarchy throws carrots to the rebels to appease them just enough to keep the rebellion down.

And that is my rub with the Marketing Driven DevOps drivel being produced today. It’s a fucking carrot to appease the rebels in order to keep the status quo, kingdoms intact, and incumbents in bed with the monarchy. It’s an illusion to pretend you’re doing something new, and at the end of the day thinking, “All this hard work is just my price for my success.”

Parkinson’s Law – The Economist – November 19th, 1955 – http://www.economist.com/node/14116121

What if Everything We’ve Been Doing is Wrong?

After I wrote my last post, I was talking with Donnie Berkholz as we traveled to FOSDEM. Donnie commented on how powerful the post was, yet it left the reader hanging. He, and other readers, wanted more. So I’ve taken the liberty of breaking down more of the reasons Enterprise IT needs a “special kind of DevOps” as posted by Andi Mann. I don’t want anyone to think I am picking on Andi personally. Rather, his post reminds me of all the excuses Enterprises give as to why “We can’t change”. As Mick told Rocky, “There ain’t no can’ts!”

  • They cannot achieve the same levels of agility and personal responsibility as a smaller or less complex organization.

Why not? Principles that teach agility and speed have long been used at large companies such as Microsoft. (Yes, feel free to say Microsoft is a bad example; they are still one of the world’s largest software companies.) Additionally, if someone doesn’t want to take personal responsibility for what they produce for a company, maybe they are in the wrong job at the wrong company?

  • They cannot stream new code into production and just shut down for a couple of hours to fallback if it fails.

This is foolhardy to begin with. The goal of methods such as Continuous Integration is to be constantly building releases and testing them to catch problems before they are released to production. Also, the idea is to test small changes, so you know exactly what breaks, rather than large chunks of code. Large enterprises “cannot stream new code” because they haven’t built the necessary flows in front of production releases to effectively and efficiently test and verify code changes. This requires IT organizations to fully automate their processes all the way down to server builds, something they are often incapable of doing because of an attachment to the “old way of doing things”.

  • They rarely ever have ‘two pizza teams’ for development or operations (indeed, they are lucky if they have ‘two Pizza Hut teams’).

The size of the team is nearly always irrelevant. Within each Pizza Hut there are tables, and each table consumes the pizza buffet. The goal of DevOps is to increase the flow of the work through those tables so the teams can eat their pizza and leave quicker. As I’ve said before, focusing on the Silos is the wrong way to solve the problem. Rather focus on the grain elevators that move the grain to produce something meaningful.

  • They cannot sign up for cloud services with a credit card without exceeding their monthly limit and/or being fired.

Get an MSA/PO with the cloud vendor or build a Private Cloud. Cloud or no cloud, building strong automation on top of existing VM or server infrastructure can help alleviate many problems in service delivery.

  • They cannot allow developers to access raw production data, let alone copy it to their laptop for development or testing.

Scrub the data. DevOps or not, this is a problem that we solved years ago. When I worked at a major e-commerce site, real data was often required for testing, but that data was always cleaned of any sensitive PII. This is not an issue that is unique to DevOps.
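As a rough illustration of what “scrub the data” can look like, here is a minimal sketch. The field names and the record layout are hypothetical, not taken from any real system, and a plain unsalted hash is only a stand-in: a real scrubbing job would need a proper masking or tokenization scheme that your compliance people have signed off on.

```python
import hashlib

# Hypothetical list of fields treated as sensitive; a real one comes from your data owners.
SENSITIVE_FIELDS = {"name", "email", "card_number"}


def scrub(record: dict) -> dict:
    """Return a copy of the record with sensitive fields replaced by opaque tokens."""
    cleaned = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS:
            # Same input always yields the same token, so joins across tables still work.
            digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:12]
            cleaned[key] = f"{key}_{digest}"
        else:
            cleaned[key] = value
    return cleaned


order = {"order_id": 1001, "email": "alice@example.com", "order_total": 42.5}
scrubbed = scrub(order)
print(scrubbed)
```

The shape of the data survives, so developers can still test against realistic volumes and relationships, while the values that identify a person never leave production.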

  • They cannot choose to stream new code into production in violation of a change freeze, or even without the prior approval of a CAB.

Once again, this assumes that DevOps is all about willy-nilly pushing of code to production. One aspect of DevOps is about increasing the flow of the work through the system by optimizing the centers where value is added. As I’ve discussed before, the principles and practices of DevOps actually help things like Change Control.

  • They cannot just tell developers to carry pagers ‘until their software is bedded in’ (not least because their developers have always carried pagers, and on a full-time basis).

If Devs already carry pagers, then they’ve already been told to carry pagers, hence, “they” can indeed tell their Devs to carry pagers. Additionally, bedding in of the software should happen in the lower environments as discussed previously. If you’ve done things right before production, pagers become a tool that is used when things go really badly. It’s a form of monitoring and incident response that becomes meaningful again because you aren’t being paged for endless break fix work.

  • They cannot put developers and operators together because one team works 24×7 shifts in 7 data centers while the other works 16-hour days in 12 different locations.

Well, good, they at least have 16 hours a day together. Highly distributed remote teams are becoming more and more common. Technology is evolving to support remote work, and people are finding creative ways to work around the distance. I’m also against the idea that DevOps is all about merging dev and ops onto one team, because that is not the point. The idea, as already stated, is to increase the flow of work between Dev and Ops and build a culture of continuous improvement between the two groups (three groups if you include the business). Dev, Ops, Business, who gives a shit. The point is working towards a common goal, no matter where you sit.

What large IT shops cannot do is be satisfied anymore with the status quo. They cannot accept the ways of the past any longer, and they have to start thinking about blowing up their way of doing things. They cannot let the castles and fiefdoms of the past get in the way any longer.

I think the single most powerful question any IT shop can ask themselves is, “What if everything we’ve been doing over the last X years is completely wrong?” Start there, and reevaluate everything you’ve been doing to achieve (or not achieve) the results your customers require.

You’re Not a Beautiful and Unique Snowflake

“You are not special. You’re not a beautiful and unique snowflake. You’re the same decaying Enterprise IT Org as everyone else. We’re all part of the same compost heap. We’re all singing, all dancing crap of IT.” — Apologies to Chuck Palahniuk

I’ve seen a few exchanges from “Enterprise IT” vendors on Twitter about the need for “a different kind of DevOps” for Enterprise IT. This culminated in a blog post from Andi Mann from CA on “Big Enterprises Need Big DevOps”. I’ll avoid the proverbial piss taking that could take place on the title alone and instead focus on the content.

First, let me say that Andi is spot on in the problems he mentions with Enterprise IT. Andi highlights that code cannot be “streamed into production” because of change controls. Audit and Compliance is critical for many large IT organizations. Enterprise IT can’t go buy cloud services with a credit card, and so on. In the end, Andi proposes that a new form of DevOps, Big DevOps, is needed to handle the unique nature of Enterprise IT.

But like a first-year med student who is trying to impress the professor with an intelligent response, Andi is focusing on the symptoms of the problem rather than its causes. Giving a patient a prescription for pain killers because he has a headache will do nothing if the cause of the headache is the patient constantly banging his head against his desk. The only people who benefit from that scenario are the doctor, who gets to pay for his boat with the extra office visits, and the prescription drug salesperson, who is making their quota (and taking the doctor to steak dinners).

The problem with many Enterprise IT shops is that they think they are a special and unique snowflake. They won’t stop talking long enough to understand how they might actually be their own worst enemy in creating all this process that is not “small DevOps Compliant”. Instead of understanding how the tenets of DevOps can achieve the same goal as many of their legacy processes, they are immediately dismissive.

Take, for instance, the issues around audit, compliance, and change control. Many legacy change controls were put in place because changes to the environment were impossible to track across one or a hundred systems. But the ideas of automation and Infrastructure as Code have evolved to help alleviate this problem. Wrapping things like Source Control Management and Test Driven Development around your automation allows you to 1) have tested infrastructure code, 2) audit what is changing in your environment, 3) have an audit trail of who changed things, and 4) know exactly when it changed. Compare that to legacy change control processes if you will.
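Here is a toy sketch of the what, who, and when part of that list. It is not how any particular source control system works internally; it is an in-memory stand-in, with made-up config keys and author name, that shows the kind of record you get for free once infrastructure changes go through version control.

```python
import difflib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Change:
    author: str     # who changed it
    when: datetime  # when it changed
    diff: str       # what changed


def record_change(log: list, author: str, old: str, new: str) -> Change:
    """Append an audit entry describing the difference between two config versions."""
    diff = "\n".join(
        difflib.unified_diff(
            old.splitlines(), new.splitlines(), "before", "after", lineterm=""
        )
    )
    change = Change(author=author, when=datetime.now(timezone.utc), diff=diff)
    log.append(change)
    return change


audit_log = []
old_config = "ntp_server: 10.0.0.1\nmax_connections: 100"   # hypothetical config
new_config = "ntp_server: 10.0.0.1\nmax_connections: 250"

change = record_change(audit_log, "alice", old_config, new_config)
print(change.author, change.when.isoformat())
print(change.diff)
```

A real setup gets the same three facts from the commit history itself, and the tests that run against each commit cover the first item on the list.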

If you want to be successful with any large scale organizational change, you need to assume that everything you are currently doing is wrong and be open to change. Attempting to conform the organizational change to the organization just leaves you with the same organization you had in the first place.

Which brings me around to this post from ZeroTurnaround on “Why your organization hates DevOps and won’t implement it this year (again)”. They make excellent points that echo and reinforce the points made in this post. Enterprise IT won’t do anything about DevOps or Cloud or anything else this year. They are too happy with the status quo. They want the change to conform to them and their processes. But change doesn’t work like that. Change is often hard, but if you dislike change, you’ll dislike irrelevance even more*.

*Props to @jonisick for that great quote.