
Roll up your chair: How one small change sparked a DevOps revolution

Steve Fenton

My first encounter with DevOps was so simple that I didn’t even realize its power. Let me share the story so you can see how it went from accidental discovery to deliberate practice, and why it was such a dramatic pivot.

The backdrop to this pivotal moment was a software delivery setup you might find anywhere. The development team built software in a reasonably iterative and incremental fashion. About once a month, the developers created a gold copy and passed it to the ops team.

The ops team installed the software on our office instance (we drank our own champagne). After two weeks of smooth running, they promoted the version to customer instances.

It wasn’t a perfect process, but it benefited from muscle memory, so there was no pressing reason to change it. The realization that a change was needed came from the first DevOps moment.

The unplanned first moment

When the ops team deployed a new version, they would review the logs to see if anything interesting or unexpected popped up as a result of the deployment. If they found something, they couldn’t get a quick answer from the developers, so they sometimes opted to roll back rather than wait.

This was a comic-strip situation because the development team was a few meters away in their team room. It’s incredible how something as simple as a door transforms co-located teams into remote workers.

The ops team raised their request through official channels, and the developers didn’t even know they were causing more work and stress because the ticket hadn’t reached them yet.

Thankfully, one of the ops team members highlighted this. The next time they started a deployment, a developer was paired with them to watch the logs. It was a low-fi solution, and not one you’d think much about. That developer was me. For this post, we’ll call my ops team partner “Tony”.

Shared surprises lead to learning

The day-one experience of this new collaborative process didn’t seem groundbreaking. When a log message popped up that surprised Tony, it surprised me too. The messages weren’t any more helpful to a developer than they were to the ops team.

I could think through what might be happening, talk it through, and then Tony and I would come up with a theory. We’d test the theory by trying to make another similar log message appear. Then we’d scratch our heads and try to decide whether this could wait for a fix or warranted a rollback.

The plan to bring people from the two teams together was intended to remove the massive communication lag, and it did. But more significant gains arrived as side effects of the change.

Resolve pain pathways by completing the loop

As a developer, when you generate log messages and then have to interpret them, you’ve completed a pain loop. Pain loops are potent drivers of improvement.

Most organizations have unresolved pain pathways. That means someone creates pain, like a developer throwing thousands of vague exceptions every minute, and then someone else feels it, like Tony when he’s trying to work out what the log means.

There are two ways to resolve the pain pathway.

  • Process: You create procedures that keep pain below a tolerable threshold and limit the rate at which it is generated.
  • Loops: You connect the pain into a loop, so the person causing the pain feels its signal.

If I’m the one who gets the electric shock when I press the button, I stop pushing it, even if someone in a white coat instructs me to continue the experiment.

With the pain loop connected, I realized we should log fewer messages to reduce the scroll and review burden. Instead of needing institutional knowledge of which messages were perpetually present and could therefore be ignored, we could stop logging them.

The (perhaps asymptotic) goal was to log only the events that required human review, with a toggle that let more verbose logging be generated on demand. Instead of scrolling through a near-infinite list of logs, you’d have a nearly empty view. If a log appeared, it was important enough to warrant your attention.
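
To make that concrete, here’s a minimal sketch of a quiet-by-default logger with an on-demand verbosity toggle. It’s written in TypeScript for illustration; the logger, log levels, and environment variable are hypothetical rather than what we actually ran:

```typescript
// A minimal sketch of quiet-by-default logging; everything here is
// hypothetical, not a specific library's API.
type Level = "debug" | "info" | "warn" | "error";

// The verbosity toggle: flip it on when actively investigating.
const verbose = process.env.VERBOSE_LOGGING === "true";

function log(level: Level, message: string): void {
  if ((level === "debug" || level === "info") && !verbose) {
    return; // Suppressed: not worth a human's attention by default.
  }
  console.log(`${new Date().toISOString()} [${level.toUpperCase()}] ${message}`);
}

log("info", "Cache warmed for the product catalog"); // hidden unless toggled
log("error", "Payment provider returned an unexpected status"); // always shown
```

With something like this in place, an empty log view means “all clear”, and flipping the toggle brings back the full firehose when you need it.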

The next idea was to improve the information in the log messages. We could identify which customer or user experienced the error and provide context for it. By improving these error messages, we could often identify the bug before we even opened the code, dramatically reducing our investigation time.
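
To give a flavor of what those improved messages looked like, here’s a hedged sketch of a context-rich error entry; the field names and the logError helper are invented for illustration:

```typescript
// A sketch of a structured, context-rich error log entry; the
// LogContext shape and logError helper are hypothetical.
interface LogContext {
  customerId: string;
  userId: string;
  operation: string;
}

function logError(message: string, context: LogContext, cause?: Error): void {
  // JSON output keeps the context searchable as well as human-readable.
  console.error(
    JSON.stringify({
      timestamp: new Date().toISOString(),
      level: "error",
      message,
      ...context,
      cause: cause?.message,
    }),
  );
}

logError("Invoice generation failed", {
  customerId: "cust-42",
  userId: "user-7",
  operation: "GenerateMonthlyInvoice",
});
```

An entry like this often tells you which code path failed, and for whom, before you’ve opened the debugger.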

This process evolved into the three Fs of event logging.

Create positive spirals with delightful deployments

Another thread that emerged from the simple act of sitting together during deployments was the realization that the deployment process was nasty. We created an installer file, and the ops team would move it to the target server, double-click it, then follow the prompts to configure the instance.

Having to paste configuration values into the installer was slow and error-prone. We spent a disproportionate amount of time improving this process.
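
To show the shape of the improvement, here’s a sketch of the kind of wrapper script that replaces interactive prompts with per-environment settings. The installer, its /quiet flag, and the config file layout are all invented for illustration:

```typescript
// A hypothetical install wrapper: read per-environment settings from
// a file and pass them as silent-install arguments instead of pasting
// values into interactive prompts. "setup.exe" and "/quiet" are
// illustrative, not a real installer's API.
import { execFileSync } from "node:child_process";
import { readFileSync } from "node:fs";

const environment = process.argv[2] ?? "office";
const settings = JSON.parse(
  readFileSync(`./config/${environment}.json`, "utf8"),
);

execFileSync("setup.exe", [
  "/quiet",
  `DB_CONNECTION=${settings.dbConnection}`,
  `LICENSE_KEY=${settings.licenseKey}`,
]);

console.log(`Installed using the ${environment} settings.`);
```

Run with an environment name as its argument, the same script works anywhere a settings file exists.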

Admittedly, we were solving this one “inside the box” by improving an individual installation with DIY scripts, a can of lubricating spray, and sticky tape. This didn’t improve the experience of repeating the install across several environments and multiple production instances.

However, I did get to experience the stress of deployments when their probability of success was anything less than “very high”. When deployments weren’t a solved problem, they could damage team reputation, erode trust, and reduce autonomy.

Failed deployments are a leading cause of organizations working in larger batches, and large batches are a leading cause of failed deployments. This is politely called a negative spiral, and you have to reverse it urgently if you want to survive.

At last, a panacea

The act of sitting a developer with an ops team member during deployments isn’t going to solve all your problems. As we scaled from 6 to 30 developers, pursued innovative new directions for our product, and repositioned our offering and pricing, new pain kept emerging. Continuous improvement really is a game of whack-a-mole, and there’s no final state.

Despite this, the simple act of sitting together, otherwise known as collaboration, caused a chain reaction of beneficial changes.

Sharing goals and pain

When you’re sitting with someone working on the same problem, all the departmental otherness evaporates. You’re just two humans trying to make things work.

Instead of holding developers accountable for feature throughput and the ops team for stability, we shared a combined goal of high throughput and high stability in software delivery.

That removed the goal conflict and encouraged us to share and solve common problems together. This also works when you repeat the alignment exercise with other areas, like compliance and finance.

Completing the pain loop

The problem with our logging strategy was immediately apparent when one of the people generating the logs had to wade through them. This is a powerful motivator for change.

Identifying unresolved pain paths and closing the pain loop isn’t a form of punishment; it’s a moment of realization. It’s the reason we should all use the software we build: it highlights the unresolved pain paths we’re burdening our users with.

Pain loops are crucial to meaningful improvements in software delivery.

Reducing the toil

Great developers are experts at automating things. When you expose this skill set to repetitive work, a developer’s instinct is to eliminate the toil.

For the ops team, the step-by-step deployment checklist was just part of doing business. They were so familiar with the process that it became invisible.

When we reduced the toil, the ops team was definitely happier, even though we hadn’t smoothed out all the rough edges yet.

Refining the early ideas

The fully formed ideas didn’t arrive immediately. The rough shapes were polished over time into a set of repeatable and connected DevOps habits.

The three Fs, incident causation principles, alerting strategy, and monitor selection guidelines graduated into deliberate approaches long after this story.

I developed an approach to software delivery improvement that used these ideas to address trust issues between developers and the business. By reducing negative signals caused by failed deployments and escaped bugs, we increased trust in the development team, enhanced their reputation, and increased their autonomy.

We combined these practices with Octopus Deploy for deployment and runbook automation and an observability platform, which meant the team was the first to spot problems rather than users. When there was a problem, it was trivial to fix, and the new version could be rolled out in no time.

Unlike in the original organization, where we increased collaboration between separate teams, we later created fully cross-functional teams that worked together all the time. Every skill required to deliver and operate the software was embedded in the team, minimizing dependencies and the risk of silos, tickets, and bureaucracy.

These cross-functional teams also proved to be the best way to level up team members.

Unicorn portals

You can’t work with a database whizz for long before you start thinking about query performance, maintenance plans, and normalization. You build better software when you develop these skills. You can’t work with an infrastructure expert without learning about failovers, networking, and zero-downtime deployments. You build better software when you develop these skills, too.

When people say they can’t hire these highly skilled developers, they miss the crucial point. A team designed in this cross-functional style takes new team members and upgrades them into these impossible-to-find unicorns. You may start as a backend developer, a database administrator, or a test analyst, but you grow into a generalizing specialist with many new skills.

Creating these unicorn portals is the most valuable skill a development manager can bring to an organization. You need to hire to fill gaps and foster an environment where skills transfer fluidly throughout the team.

Roll up your chair

What became a sophisticated and repeatable process for team transformation could be traced back to that simple act of sitting together. It was a small, easy change that led to increased empathy and understanding, and then a whole set of improvements.

Staring at that rapid stream of logs was the pivot point that led to the healthiest and most human approach to DevOps.

We didn’t have the research to confirm it back then, but deployment automation, shared goals, observability, small batches, and Continuous Delivery are all linked to better outcomes for people, teams, and organizations. Everybody wins when you do DevOps right.

Happy deployments!

Steve Fenton

Steve Fenton is a Principal DevEx Researcher at Octopus Deploy and an 8-time Microsoft MVP with more than two decades of experience in software delivery.
