The DevOps engineer's handbook

Platform Engineering patterns and anti-patterns

Platform Engineering has been around for a while. Interest has recently accelerated, however, thanks to its relevance to DevOps.

For some, Platform Engineering solves a common DevOps Topologies problem: embedding IT operations into DevOps teams that lack the time and experience to do it well. This is an anti-pattern, also known as Anti-Type F.

Platform Engineering can help you scale the organization and reduce developer overload.

Use our Platform Engineering decision flow to find out if it’s the right solution to your problems. It also guides you to put key measurements in place early so you can track the value of your internal developer platform (IDP).

We describe the anti-patterns and positive patterns of Platform Engineering below. Let us know if you have a suggestion.

Platform Engineering anti-patterns

If you’re familiar with Team Topologies, you’re less likely to fall into one of the common Platform Engineering anti-patterns. The library can still be a helpful way to describe problems to your colleagues when you find trouble.

Complex platforms

An IDP should reduce complexity and remove cognitive overload for developers. If the platform is too difficult to use, it has failed at a crucial goal. There are several ways a platform can add to a developer’s burden:

  1. It uses a configuration file format that isn’t familiar to the developer
  2. APIs don’t have a consistent convention for methods and parameters
  3. There’s no documentation
  4. The platform doesn’t reduce developer pain

If you have configuration files as part of your platform, be flexible with the format. It should be trivial to parse the same data from a JSON file from web developers and a YAML or XML file from back-end teams. This means the developers can use a format familiar to them rather than one convenient to you.
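As a minimal sketch of that idea, the following Python loads the same settings from either JSON or XML into one dictionary. The setting names (`service`, `replicas`), the `<config>` root element, and the `load_config` helper are hypothetical, chosen only for illustration. YAML is left out because Python's standard library has no YAML parser; a third-party parser could be added to the same dispatch table.

```python
import json
import xml.etree.ElementTree as ET


def parse_json(text: str) -> dict:
    """Parse a flat JSON object into a dictionary of string values."""
    return {key: str(value) for key, value in json.loads(text).items()}


def parse_xml(text: str) -> dict:
    """Parse <config><key>value</key>...</config> into a dictionary."""
    root = ET.fromstring(text)
    return {child.tag: (child.text or "").strip() for child in root}


# Hypothetical dispatch table: one parser per supported format.
PARSERS = {"json": parse_json, "xml": parse_xml}


def load_config(text: str, fmt: str) -> dict:
    """Return the same dictionary shape regardless of the input format."""
    try:
        parser = PARSERS[fmt]
    except KeyError:
        raise ValueError(f"Unsupported config format: {fmt}") from None
    return parser(text)


json_text = '{"service": "checkout", "replicas": 3}'
xml_text = "<config><service>checkout</service><replicas>3</replicas></config>"

# Both formats yield identical settings for the platform to consume.
print(load_config(json_text, "json") == load_config(xml_text, "xml"))  # True
```

Values are normalized to strings here so both formats compare equal; a real platform would more likely validate them against a schema.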

When creating an API or command line interface (CLI), being meticulous and consistent with method names, parameter names, and the order of parameters can make it easier to use.

Your IDP documentation should be concise and clear, focusing on what’s available and how to use it.

Where developers don’t yet meet governance, risk, and compliance (GRC) objectives, a platform that introduces these requirements adds more load to development teams. Solving existing developer problems first allows you to solve external requirements without burdening developers.

If your IDP doesn’t significantly reduce developer cognitive overload, complexity, and burnout, it doesn’t work. It also fails if it transfers all the pain to the platform team.

“Field of dreams” platforms

In the 1989 movie Field of Dreams, a voice tells Ray Kinsella (Kevin Costner) that “If you build it, they will come.” In the film, building a baseball park in a cornfield reunites Kinsella with his father and produces a stream of fans coming to watch games at the farm.

If you take the same approach when building your internal developer platform, you’ll struggle to gain internal market share. You need to involve developers from the start and build a platform that solves their specific problems.

If you build the platform before talking to its potential customers, you find yourself with a product that doesn’t fit your market. Some symptoms include:

  • Poor technology fit: Making it easier to manage Kubernetes only to find developers moved to serverless functions.
  • Skill assumptions: Creating a simplified template for microservices only to find developers are microservice experts and don’t value your solution.
  • Forcing choices: Designing a platform with the latest and greatest technology and finding the business doesn’t want to move from stable solutions.

You can’t design your IDP on faith alone. You must talk to developers across many teams to find common pain points your platform could remove. Creating a platform in isolation guarantees low adoption rates.

The skill concentration trap

When you create a Platform Engineering team, you’ll likely add people with the most experience in areas the platform will cover. This can cause a problem when:

  1. The platform team thinks they know better than the development teams
  2. Development teams lose the knowledge they need to run their software
  3. Development teams lose the people who could migrate them onto a platform

If you move people from development teams to create a platform team, you need a strong succession plan and a greater focus on collaboration to upskill the development teams for self-service. You must ensure you don’t create cognitive overload and burnout by removing skilled team members.

If people leaving the teams were the only ones who cared about operations, the remaining members might see little benefit to the platform. It would be a disaster if developers ignored operations tasks.

You may need to start your platform team with fewer people. You can then solve problems that make it easier for teams to release more people. Choosing platform team members for their communication and collaboration skills, and not just their technical knowledge, increases the chances of success.

Magpie platforms

People attracted to shiny objects are often called magpies. That’s because members of the crow family tend to take shiny objects belonging to their owners when kept in captivity. A magpie platform focuses on shiny new technologies and not solving problems developers face with existing systems.

Building a platform to support a greenfield project is attractive, but there are far better ways. You make more impact by helping teams with their existing production software.

You build a better platform for the new software once you can identify the areas where developers struggle. When a development team adopts new technology, it can look like they need help with everything. You should let them establish their knowledge for the actual pain points to emerge. Many of their early pains will disappear when they get used to the new tech.

You need to prove the platform’s impact, so a stable starting point is necessary for others to take your measurements seriously. Stakeholders may dismiss your improvements if they think they’re due to teams getting familiar with the technology and business domain.

Platforms must prove their impact on existing teams and the tech stacks they use. There will be many opportunities to merge technologies where teams solved a problem differently. Sometimes introducing a new tool for deployment or monitoring can remove pain from many teams at once. It’s crucial you can show the platform’s benefits.

Underinvested platforms

Once you’ve introduced a platform into your organization, it needs long-term investment to keep it usable. When a platform team disbands after delivering a product, the platform can become an anchor that adds drag to all who depend on it.

The context that led to the decisions you baked into the IDP can change, leaving teams stuck with old choices that no longer work. The platform will become so difficult to use that teams must migrate away from it to get their work done.

Platforms need to continually improve. You need to add new golden pathways and phase others out to ensure the platform continues to benefit the business.

You should limit the platform’s ambition to match the long-term investment levels.

Stretched platforms

If your organization has a fragmented technology landscape, it can be tempting to solve too many problems with the IDP. If you support all existing tools and technology, the platform will become stretched too thin to be useful. Your platform team will be even more overloaded than the developers were.

A common cause of stretched platforms is measuring the platform team on internal market share. Adding more golden pathways is one way to increase market share, as the platform can help more teams. However, this defeats the goal of using the pathways to encourage consolidation around good technology choices.

MONK metrics can help balance market share with developer satisfaction and outcomes. This can help you avoid the problem of stretched platforms.

Other signs of trouble

Alongside common anti-patterns, you should watch for other signs of trouble. Common gaps can open when an IDP becomes available.

Self-service cost problems

When you make a platform with self-service options, you must ensure a strong story around how you track and control cost. If every developer spins up production-like environments, costs could rise out of control.

Any button that makes spending easy should include a feature to limit that spending automatically. This means test environments have short lifespans, and there’s a cap on concurrent environments.
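A minimal sketch of such a guardrail in Python follows. The limits (3 concurrent environments per owner, an 8-hour lifespan), the per-owner scope of the cap, and every name in the code are assumptions for illustration rather than recommendations.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed limits for illustration; real values depend on your budget.
MAX_CONCURRENT_PER_OWNER = 3
MAX_LIFESPAN = timedelta(hours=8)


@dataclass(frozen=True)
class Environment:
    name: str
    owner: str
    created_at: datetime


def is_expired(env: Environment, now: datetime) -> bool:
    """True once an environment has reached its maximum lifespan."""
    return now - env.created_at >= MAX_LIFESPAN


def environments_to_delete(envs: list[Environment], now: datetime) -> list[Environment]:
    """Environments a scheduled cleanup job should tear down."""
    return [env for env in envs if is_expired(env, now)]


def can_create(envs: list[Environment], owner: str, now: datetime) -> bool:
    """Allow a new environment only while the owner is under the cap."""
    active = [
        env for env in envs
        if env.owner == owner and not is_expired(env, now)
    ]
    return len(active) < MAX_CONCURRENT_PER_OWNER


now = datetime(2024, 1, 1, 12, 0)
envs = [
    Environment("pr-101", "team-a", now - timedelta(hours=9)),
    Environment("pr-102", "team-a", now - timedelta(hours=3)),
    Environment("pr-103", "team-a", now - timedelta(hours=2)),
    Environment("pr-104", "team-a", now - timedelta(hours=1)),
]

print([env.name for env in environments_to_delete(envs, now)])  # ['pr-101']
print(can_create(envs, "team-a", now))  # False: three active environments
print(can_create(envs, "team-b", now))  # True
```

Checking the cap at the moment of the request means the self-service button itself enforces the limit, rather than relying on a later cost review.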

There’s an opportunity for the platform team to bridge with the finance team on this subject.

General-purpose DevOps platforms

When platform decisions happen away from technical knowledge, there’s a temptation to try an all-in-one DevOps platform to solve all problems.

This doesn’t align with Platform Engineering or DevOps. If you buy a single general-purpose tool, you limit the platform’s capability and your continuous improvement abilities. The features of the tool constrain the solution for all problems.

Instead, you should design the platform as an internal product that solves specific pain points of development teams. Rather than opting for general-purpose tools, select best-in-class tools where you need them. You’re also likely to create reusable libraries that make it easier for developers to align with unified logging and telemetry design.

A golden pathway is not a tool, though it makes sense to increase the platform’s capability by leaning into specialist tools for builds, deployments, and monitoring.

Flexibility has two sharp handles

While platform teams aim to help development teams move towards a standard landscape, edge cases will always exist. A rigid platform will force development teams to exit and find alternatives. Talking with teams and listening for edge cases allows platform teams to counter edge cases and keep internal market share.

Equally, accommodating every scenario without any consolidation into golden pathways causes other problems. It could swamp the platform team with the complexity and cognitive load that Platform Engineering should remove. The best software products have a clear focus. They don’t try to do everything.

You should design the platform to allow teams to add edge cases without necessarily bringing the edge cases into support. Edge cases shouldn’t force people off the platform or increase the platform’s surface.

Don’t attempt to support every tech stack in the business. When you report on platform market share, highlight:

  • Total market size (all developers)
  • Size of the supported market (all developers using a tech stack supported by a golden path)
  • Market share (the number of developers using the platform)

Figure: market share chart tracking the total number of developers, the number of developers who could use the internal developer platform, and the adoption of the platform.

The market share chart shows:

  • The total size of the internal developer market
  • How much of the market could use the internal developer platform
  • How many chose to do so

You can use the chart to guide decisions for new golden pathways and track the platform’s appeal to its intended audience.
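The three figures can be turned into percentages with simple arithmetic. The sketch below assumes you already have the three head counts; the function name, the dictionary keys, and the sample numbers are hypothetical.

```python
def market_share_report(total: int, supported: int, users: int) -> dict:
    """Summarize platform reach as percentages.

    total     -- all developers (total market size)
    supported -- developers on a tech stack covered by a golden path
    users     -- developers who actually use the platform
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not (0 <= supported <= total and 0 <= users <= total):
        raise ValueError("supported and users must be between 0 and total")

    return {
        # How much of the organization the platform could serve today.
        "supported_market_pct": round(100 * supported / total, 1),
        # Adoption measured against every developer.
        "share_of_total_pct": round(100 * users / total, 1),
        # Adoption measured against developers the platform targets.
        "share_of_supported_pct": (
            round(100 * users / supported, 1) if supported else 0.0
        ),
    }


report = market_share_report(total=200, supported=120, users=90)
print(report)
# {'supported_market_pct': 60.0, 'share_of_total_pct': 45.0, 'share_of_supported_pct': 75.0}
```

Reporting the share of the supported market separately shows the platform's appeal to its intended audience without creating pressure to support every tech stack.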

Platforms can’t solve culture

It’s easy to create a team. Building the interactions that make teams successful is far more work. Team Topologies describes this using a combination of team types and interaction modes.

If you create a platform team without paying attention to the design of interactions, it will fail.

Software delivery requires a high-trust, low-blame culture to succeed. Platform Engineering is no different. Culture is one of the biggest predictors of performance. Find out more about DevOps culture.

Building a platform isn’t a project. If you create an IDP, it must be a long-term strategy. When a platform is on hold, it limits the performance of all its users. This is worse than not creating a platform in the first place.

You need solid team design, clear interaction modes, and a platform as a product approach to build a good IDP.

How to do Platform Engineering well

Now you’ve seen many ways Platform Engineering can fail, it’s time to look at what success looks like.

In the Puppet State of Platform Engineering Report, 94% of respondents found the concept helped the organization realize the benefits of DevOps. If you have strong performance against the DORA Metrics, you’re likely to succeed with Platform Engineering too.

The internal developer platform is a product

The most critical sign of good Platform Engineering is that it’s run as a product. Developers are customers, and the platform should solve their problems to win internal market share.

If you’re doing this well, you’ll have regular contact with developers and measurable platform success criteria. If you have a product manager, they don’t only maintain a backlog. They should raise awareness of the platform and network with other departments to incorporate their needs. They help developers satisfy security, governance, risk, and compliance needs.

The platform needs a strong vision that inspires good decision-making. The platform shouldn’t support bad choices to gain internal market share. The product manager should capture decisions to refer to later on. It’s common for good decisions to get reversed when everyone’s forgotten the reason. A register of decisions can prevent mistakes.

When you treat the platform as a product, you must ensure documentation quality stays high. High-quality documentation is crucial to product adoption.

Having a product-based approach and running a competitive product are vital signs of good Platform Engineering.

Use DevOps to build the platform

DevOps combines the following to increase software delivery performance:

  • Transformational leadership
  • Lean product management
  • Continuous Delivery
  • Organizational culture

The same capabilities will increase the likelihood of creating a valuable platform.

You don’t need to create a fully formed golden pathway immediately. Instead, solve 1 wide-reaching problem and iterate to success.

A typical pattern is to make it easy to:

  1. Deploy common approved tech stacks
  2. Provision infrastructure and create environments
  3. Collect telemetry and error logs and track systems in Production

The underlying pattern is to find and abstract a complex area so the development team doesn’t get lost in the weeds. In some cases, an off-the-shelf tool will help with this. In other cases, the platform team might create a facade to simplify a tricky area.

The goal should be to start with the thinnest viable platform that supports developers where they most need it. It’s better to expand features slowly. Pick the areas with the most significant impact rather than trying to solve all problems.

For example, if your toolset has large JSON, YAML, and XML configuration sets, you could provide a unified format offering fewer options. Developers then only need to adjust 1 small file in a single format instead of many large files in different formats. You should provide considered default options for all the settings you don’t expose through the platform.
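One way to picture this is a small resolver that merges a developer's few settings with platform defaults and then renders the files each tool expects. Everything in this Python sketch is hypothetical: the setting names, the defaults, and the two output formats stand in for whatever your own toolset needs.

```python
import json

# Hypothetical defaults the platform team chooses on behalf of developers.
PLATFORM_DEFAULTS = {
    "replicas": 2,
    "log_level": "info",
    "health_check_path": "/health",
    "timeout_seconds": 30,
}

# The only settings developers can change through the platform.
EXPOSED_SETTINGS = {"replicas", "log_level"}


def resolve_settings(developer_settings: dict) -> dict:
    """Merge the developer's small file with the platform defaults."""
    unknown = set(developer_settings) - EXPOSED_SETTINGS
    if unknown:
        raise ValueError(f"Settings not exposed by the platform: {sorted(unknown)}")
    return {**PLATFORM_DEFAULTS, **developer_settings}


def render_deployment_json(settings: dict) -> str:
    """Render the JSON a hypothetical deployment tool expects."""
    document = {
        "replicas": settings["replicas"],
        "probe": {
            "path": settings["health_check_path"],
            "timeoutSeconds": settings["timeout_seconds"],
        },
    }
    return json.dumps(document, indent=2)


def render_logging_xml(settings: dict) -> str:
    """Render the XML a hypothetical logging tool expects."""
    return f'<logging level="{settings["log_level"]}" />'


# The developer only writes this one small file.
settings = resolve_settings({"replicas": 5})

print(render_deployment_json(settings))
print(render_logging_xml(settings))  # <logging level="info" />
```

Rejecting unknown keys keeps the exposed surface small on purpose: adding a setting to `EXPOSED_SETTINGS` becomes a deliberate product decision.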

Another popular option is putting different tools and workflows behind a simple API for developers.

Make good build or buy decisions

Successful platform teams know when to build something from scratch and when to bring in a specific tool for the job. Many best-in-class tools for Continuous Integration, deployment automation, and monitoring offer features to the IDP.

It would be best to avoid general-purpose solutions that try to do everything, as they can become a limiting factor. A crucial platform capability is swapping out a tool without creating work for developers. Imagine changing your event logging tool. Without Platform Engineering, it may involve compiling a list of applications to update and scheduling many similar changes. The internal developer platform should lower the cost of this change by managing the swap behind the platform API.
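The event logging example can be sketched as a thin facade. In this hedged Python illustration, `PlatformLogger`, `EventSink`, and both sink classes are hypothetical names; the sinks are in-memory stand-ins for real logging tools, not wrappers around any actual product.

```python
import json
from typing import Protocol


class EventSink(Protocol):
    """Anything that can receive a structured event."""

    def send(self, event: dict) -> None: ...


class ListSink:
    """Stand-in for the current logging tool: keeps events in memory."""

    def __init__(self) -> None:
        self.events: list[dict] = []

    def send(self, event: dict) -> None:
        self.events.append(event)


class JsonLinesSink:
    """Stand-in for a replacement tool: stores one JSON string per event."""

    def __init__(self) -> None:
        self.lines: list[str] = []

    def send(self, event: dict) -> None:
        self.lines.append(json.dumps(event, sort_keys=True))


class PlatformLogger:
    """The platform API that application code depends on."""

    def __init__(self, sink: EventSink) -> None:
        self._sink = sink

    def log_event(self, name: str, **fields) -> None:
        self._sink.send({"event": name, **fields})


def place_order(logger: PlatformLogger, order_id: int) -> None:
    """Application code: it never names the underlying logging tool."""
    logger.log_event("order_placed", order_id=order_id)


# Before the swap.
old_tool = ListSink()
place_order(PlatformLogger(old_tool), 42)

# After the swap: place_order is unchanged.
new_tool = JsonLinesSink()
place_order(PlatformLogger(new_tool), 42)

print(old_tool.events)  # [{'event': 'order_placed', 'order_id': 42}]
print(new_tool.lines)   # ['{"event": "order_placed", "order_id": 42}']
```

Only the line that constructs the sink changes during the swap, and in practice that line would live in the platform rather than in each application.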

No matter how you compose off-the-shelf and custom components, aim to:

  1. Keep the developer in their preferred place, usually a text editor or IDE.
  2. Make everything work seamlessly together
  3. Provide snap-in solutions to common problems (authentication, logging, monitoring, for example)

A team that brings in specialized tools will create a more robust platform faster than one that builds everything from scratch.

Create the network

When developers adopt your platform, they should also become part of your customer network. The teams using the platform can help each other when they get stuck. Your long-term customers can share how they have used the platform to solve their problems. That can inspire other developers to adopt or extend their use of the IDP.

You can also create opportunities for different teams to discuss typical pain points the platform could solve.

Run like open source

Let the platform’s users contribute to it; they are all developers, after all! This is ‘inner sourcing.’

This means more than accepting code contributions. It also means you need to:

  • Publish contribution guidelines
  • Keep contribution documentation up to date
  • Think about how a suggested change may impact other customers

You can find more ideas at InnerSource Commons.

Keep an eye on progress

You need to track the platform’s impact to inform product management and ensure continued investment.

The MONK metrics are an excellent place to start. The MONK metrics are:

  • Market share
  • Onboarding time
  • Net Promoter Score (NPS)
  • Key customer metrics

Read more about the Platform Engineering MONK metrics.
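Of the four, Net Promoter Score has a standard formula: the percentage of promoters (scores of 9 or 10) minus the percentage of detractors (scores of 0 to 6) on a 0 to 10 survey question. A minimal Python sketch follows; the function name and the sample survey responses are made up for illustration.

```python
def net_promoter_score(scores: list[int]) -> float:
    """Return NPS in the range -100 to 100 from 0-10 survey answers."""
    if not scores:
        raise ValueError("At least one survey response is required")
    if any(not 0 <= score <= 10 for score in scores):
        raise ValueError("Scores must be between 0 and 10")

    promoters = sum(1 for score in scores if score >= 9)
    detractors = sum(1 for score in scores if score <= 6)
    # Passives (7 or 8) count toward the total but neither group.
    return 100 * (promoters - detractors) / len(scores)


# Made-up answers to "How likely are you to recommend the platform?"
survey = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]

print(net_promoter_score(survey))  # 20.0
```

Here 5 promoters and 3 detractors out of 10 responses give a score of 20.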

Summary

There are several anti-patterns as well as a strong pattern of successful Platform Engineering.

You’ll be more likely to succeed if you run the IDP as a product with lean product management techniques. You should capture the positive outcomes development teams achieve to encourage long-term funding from the organization.

Platform Engineering solves a particular set of problems. Team Topologies provides many ways to design your team structures if Platform Engineering isn’t the right approach for you.

Further reading

  • Team Topologies by Matthew Skelton and Manuel Pais (2019)
