[Image: Stylized laptop screen showing the Octopus logo connected to cogs in the cloud, with a clipboard to the right.]

Focus on your end users when creating AI workloads

Bob Walker

Recently, I attended a conference targeted at CIOs, CTOs, and VPs of Technology. As expected, there were many sessions on AI and how it can help companies be more efficient. The example given was the well-known use of AI in the hiring process: using AI as a gatekeeper to quickly weed out all the unqualified candidates. “Your human resources people won’t have to wade through so many CVs and phone screens!”

That use case improves the efficiency of human resources, or your people team. But that efficiency comes at the cost of the end users: the people you are trying to hire. Everyone hates how AI is used in hiring processes today; phrases like “dystopian” and “Orwellian” are common. In this article, I’ll discuss why it’s essential to focus on both your AI feature’s beneficiary users and its end users.

Beneficiary user vs. end user

A beneficiary user is a person whose workload shrinks because of an AI feature. The end user is the person who actually uses the AI feature to accomplish a specific task.

Returning to the hiring process, the beneficiary user is the person responsible for wading through CVs and performing the initial phone screens. The end user is the person submitting their CV. The person in charge of screening CVs benefits from AI by offloading the repetitive work of filtering out unqualified candidates. Imagine a job posting for a senior .NET developer where 30% of the CVs submitted only include project management experience. You might think I’m exaggerating, but you’d be surprised. As a former hiring manager who had to wade through CVs, I was shocked by how many people were “CV bombing” - applying for as many positions as possible.

Looking at Octopus Deploy, the beneficiary user of our AI Assistant is the platform engineer. The end user is the developer who uses the assistant to accomplish a particular task. For example, you can ask the Octopus AI Assistant why a deployment or runbook run failed. The AI Assistant looks at the failure message and, using our knowledge base, our docs, and the web, comes up with a likely reason for the failure and suggestions on how to fix it. Assuming the suggestion is correct, the developer can quickly self-serve a solution without involving the platform engineer. The platform engineer benefits because they can focus on high-value tasks instead of helping debug a specific deployment failure. Previously, if the platform engineer didn’t know the answer, they’d have to go through our docs or do a Google search themselves.
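
To make that pattern concrete, here is a minimal sketch of that kind of failure analysis. It is not Octopus’s actual implementation: the KNOWLEDGE_BASE snippets, the naive keyword retrieval, and the ask_llm stub are all placeholder assumptions standing in for a real documentation index and model client.

```python
# Minimal sketch of retrieval-augmented failure analysis (illustrative only).
# KNOWLEDGE_BASE, retrieve_snippets, and ask_llm are placeholders for a real
# documentation index and LLM client.

KNOWLEDGE_BASE = {
    "permission denied": "Check that the deployment target's service account can write to the install directory.",
    "connection timed out": "Verify the target is reachable from the worker and that firewalls allow the configured port.",
    "variable not found": "Confirm the variable is scoped to the environment and target being deployed to.",
}

def retrieve_snippets(failure_message: str) -> list[str]:
    """Naive retrieval: return tips whose keywords appear in the error text."""
    text = failure_message.lower()
    return [tip for keyword, tip in KNOWLEDGE_BASE.items() if keyword in text]

def ask_llm(prompt: str) -> str:
    """Placeholder for a real model call (e.g., an HTTP request to an LLM API)."""
    return f"[model response based on a {len(prompt)}-character prompt]"

def analyze_deployment_failure(failure_message: str) -> str:
    """Pair the error with relevant docs, then ask the model for a cause and fixes."""
    snippets = retrieve_snippets(failure_message)
    prompt = (
        f"A deployment failed with this error:\n{failure_message}\n\n"
        "Relevant documentation:\n" + "\n".join(snippets) +
        "\n\nExplain the likely cause and suggest fixes."
    )
    return ask_llm(prompt)

print(analyze_deployment_failure("Upload failed: permission denied on /opt/app"))
```

The important part is the shape of the flow: the error message and the documentation travel together, so the suggestion is grounded in something the developer can verify on their own.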

Now that we understand the two kinds of users, let’s examine what happens when a person is both the beneficiary and the end user.

Learning the wrong lessons from the success of ChatGPT

ChatGPT and similar tools are unique because their users are both the beneficiary user and the end user.

One of the many benefits of ChatGPT is that it is an evolution of search engines. Before ChatGPT, you did a Google search, which returned a list of results. The search engine ranked the results for you, using complex algorithms built around its internal ranking system. A cottage SEO (Search Engine Optimization) industry sprang up to chase higher rankings. ChatGPT changed that by providing you with answers curated from the content of many websites.

For common questions, where many sources agree on the same answer, the results from Google and ChatGPT are close. ChatGPT is not infallible; it once insisted that Omaha, Nebraska, was 29 nautical miles from Chicago, Illinois. Google can be more accurate, but that is a result of maturity; they’ve had 25 years to iterate on and improve their search ranking algorithm.

ChatGPT is popular because of the interface, which is very similar to the Google search box. The results are where they differ. ChatGPT collates information and generates an answer that is easy to read. In addition, Google searches are very transactional: search, get a result, move on with your day. With ChatGPT, the sessions are interactive; you can ask follow-up questions, and ChatGPT remembers the entire conversation.

I’m only focused on ChatGPT’s question-and-answer aspect here. I know it can do much more, including generating content and images, composing songs, and beyond.

Unfortunately, companies seem insistent on learning the wrong lessons when analyzing popular trends. They see that people like typing prompts and getting answers or content back, and conclude, “Let’s do that for [insert use case here]!”

An awful user experience and its impact

That wrong lesson has its roots in the graphic adventure computer games of the 1980s and early 1990s.

My first computer game was Space Quest III from Sierra. Like most computer games of that era, it made me type in commands to get the on-screen character to act. There was no help guide or tutorial; I had to figure it out. My brother and I spent weeks trying to escape the first area. We had to find the magic set of commands to execute in a specific sequence in specific areas.

Last year, I started the multi-month process of changing banks from a regional to a national bank. The national bank offered a high-yield savings account, while the regional bank didn’t. I had to call the national bank a few times. They have followed the latest trend in phone support: dial the number, and a human-sounding voice asks what you need help with. Too often, the response was “I’m sorry, I didn’t get that” or “I didn’t understand.” I needed to know the magic phrase to get help, and there was no clear escape hatch to reach an operator.

Their online AI help agent was no better. It was trained on their public documents, so if the answer wasn’t in the documents, it couldn’t help me. Often, it referred me back to calling their support line, creating an endless cycle of frustration.

That experience was so bad that I went back to the regional bank. They proudly promote that you’ll talk to a real person when calling for help. I would rather lose thousands of dollars over many years than deal with the national bank’s awful AI-based help system.

I’m not the only one who hates talking to AI chatbots for support. The Commonwealth Bank of Australia (CBA) recently reversed its decision to eliminate jobs after introducing an AI-powered help system, due to the poor customer experience.

Augmenting the end user experience

The problem is that, just like humans, AI makes mistakes. Without appropriate settings, it will insist that it is correct. Where humans and AI differ is that AI is only “book smart,” while people can be both “book smart” and “street smart.” Humans use a combination of experience and acquired knowledge to make decisions, and they learn and evolve; AI, by contrast, has to be retrained. The best analogy came from Neil deGrasse Tyson in a recent interview with Hasan Minhaj: think of AI as Albert Einstein locked in a box, a sensory deprivation tank with no view of the outside world. Someone asks him random questions, and he responds with his current knowledge. He has no context beyond the knowledge he acquired before going into the box.

As a result, AI struggles with complex decisions. It doesn’t do well when something falls outside the expected parameters. In a recent study from Carnegie Mellon University and Duke University, AI agents completed multi-step tasks correctly only 30 to 35 percent of the time, and the results depended on the model used, with GPT-4o achieving an 8.6% success rate. In a recent study by Apple, many popular LRMs (Large Reasoning Models) couldn’t handle puzzles (Tower of Hanoi, Checker Jumping, Block World, and River Crossing) once the number of pieces increased beyond simple examples. Today’s AI still has to undergo many more evolutions before it resembles Tony Stark’s Jarvis in the MCU.

I’m not against using AI. Far from it. However, it’s essential to understand its limitations when designing an end-user experience.

We’ve been very methodical in finding the proper use cases for our AI Assistant. We looked at how AI could augment the user experience. That means the Octopus AI Assistant is not intended to replace the current end-user interface. That would result in a sub-par experience, the opposite of augmentation.

The challenge we wanted to solve was surfacing the correct information for the users at the right time. We wanted to let the user ask AI for help and not annoy them with unwanted pop-ups or suggestions. We didn’t want to create Clippy 2.0 in the product.

Knowing that, our four use cases for the AI Assistant are:

  1. Deployment Failure Analyzer: Read the logs of a failed deployment and offer suggestions to fix the issue.
  2. Tier-0 Support: Provide answers to end-users for common Octopus-related questions. For example, “summarize this deployment process” or “what’s a project?”
  3. Best Practices Analyzer: Using Octopus Deploy’s strong opinions, review the user’s instance to find areas for improvement.
  4. Prompt-Based Project Creation: Using templates provided by Octopus Deploy, create a new project to deploy to specific deployment targets.

Interestingly, you don’t need AI for the first three items. I could take a deployment failure, do a Google search, and likely produce similar results. Or I could use our Octopus linting tool, Octolint, for best practices. AI short-cuts all of that by collating the information and surfacing it to the user. It enables self-service for the end user.

But just as important, if the AI Assistant can’t help, users can still ask their DevOps or platform engineers for help.

That is very different from using AI in hiring or AI-based help agents. Those are replacement end-user interfaces. They don’t augment the user experience; instead, they act as pseudo-gatekeepers in front of hiring managers and support staff. They focus only on reducing the load for the beneficiary users, most likely as a way for companies to cut costs or keep demand for additional headcount down. Unless you know someone at the hiring company, or the magic phrase for the AI agent-based help, there are no alternatives.

But end users hate that experience. I believe that is one of the main reasons why IBM found that only 25% of AI initiatives have delivered the expected ROI over the past few years.

Considerations for the end user experience

When designing the Octopus AI Assistant, we didn’t want to “sprinkle AI” into the product and claim we had an AI strategy. Instead, we started with several questions about how to augment the end-user experience:

  1. What problem is the AI feature attempting to solve for the end user?
  2. What is the fallback when the AI feature encounters an unknown use case?
  3. What is an acceptable level of accuracy for the AI feature?
  4. If the response is wrong, what is the escalation process for the end user?
  5. How will the functionality be discovered?

The answers for the deployment failure functionality of the AI Assistant are:

  1. Often, failures result from an incorrect configuration, a transient error, a bug in a script, permissions, or some other common problem. In many cases, the cause is outside the direct control of Octopus. Surface the information to the user so they can self-serve the fix and decrease the recovery time.
  2. Provide a generic answer and encourage the user to contact Octopus Support or their internal experts.
  3. Reasonable accuracy is expected. Various conditions outside the control of Octopus Deploy can cause errors. Provide multiple suggestions using publicly available documentation. If none work, encourage the user to escalate to a human.
  4. If the response doesn’t help, provide a link to Octopus Support or direct the user to their internal experts. In either case, they escalate to a human.
  5. When navigating to a failed deployment or runbook run, the Octopus AI Assistant will provide a suggestion that the user can click on to get the answer.
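
As a rough illustration of answers 2 and 4, the sketch below builds on the hypothetical helpers from the earlier example (retrieve_snippets and analyze_deployment_failure, which are not real Octopus APIs). It shows the shape of the fallback: if retrieval finds nothing relevant, return a generic answer that points at a human, and even a confident suggestion always carries the escalation path.

```python
# Minimal sketch of the fallback and escalation behavior (illustrative only).
# retrieve_snippets and analyze_deployment_failure are the hypothetical
# helpers from the earlier sketch, not real Octopus APIs.

ESCALATION_NOTE = "Still stuck? Contact Octopus Support or your internal experts."

def suggest_fix(failure_message: str) -> str:
    snippets = retrieve_snippets(failure_message)
    if not snippets:
        # Unknown use case: give a generic answer and hand off to a human.
        return "I couldn't find a likely cause for this failure. " + ESCALATION_NOTE
    suggestion = analyze_deployment_failure(failure_message)
    # Even a confident suggestion can be wrong, so always show the escalation path.
    return f"{suggestion}\n\n{ESCALATION_NOTE}"
```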

The focus has been “How can we take what we have and make it better?”, not “How can we ensure that Platform or DevOps engineers are never bothered again?”

Conclusion

When an AI feature has both a beneficiary user and an end user, focus on providing a fantastic experience for the end user. Augment the end-user experience. But assume that at some point the AI will be incorrect (just like a person can be), and offer a clear escalation path. Despite the many advances in AI, experienced people still handle complex scenarios much better. When the end user isn’t considered, and the only focus is “improving the bottom line,” the result is an inferior replacement for an existing experience. End users will only put up with so much before they decide to go elsewhere.
