Moving Beyond Chatbots: How Generative AI Is Changing the Game for Customer Support

Jason Fan
Feb 22, 2023
The current state of chatbots

In recent years, customer support bots have become increasingly common as the primary point of contact for customer interactions. However, despite their widespread use, most customers do not enjoy using chatbots as their first interaction with a company. This is because these bots are often seen as gatekeepers that prioritize protecting the company’s interests over providing genuine assistance to customers. The emergence of modern language models, such as those powering ChatGPT and Bing, represents a significant advancement in customer service technology, offering the potential for AI assistants that not only deflect customer inquiries but also delight customers by providing high-quality, personalized support. These advancements hold promise for improving the overall customer experience and could represent the next major breakthrough in customer service.

Technology S curves for LLMs vs NLU based chatbots

While traditional NLU (Natural Language Understanding) and intent classification chatbots have been used for some time, a newer type of technology known as large language models (LLMs) is now gaining traction in the industry, especially after ChatGPT exploded onto the scene in late 2022. In this article, we will explore the benefits of customer support assistants driven by LLMs compared to chatbots that rely on NLU and intent classification. For brevity, we’ll refer to this new type of AI assistant built on top of LLMs as “generative assistants” and the NLU-based AI bots as “chatbots”.


LLMs, or large language models, use the innovative transformer architecture and are trained on vast amounts of text data to gain a more nuanced understanding of natural language than traditional chatbots. While chatbots rely on predetermined rules to understand and respond to user requests, generative assistants powered by LLMs can generate more natural and human-like responses by considering the entire conversation and predicting the next set of words. This is why when you use ChatGPT, it outputs text sequentially, rather than sending the final response as a single message.
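
To make "predicting the next set of words" concrete, here is a toy sketch of word-by-word generation. It is an illustration only: a tiny bigram counter stands in for the model, and it looks at just the last word, whereas a real LLM conditions on the whole conversation. The corpus and the function names (`train_bigram`, `generate`) are made up for this example.

```python
from collections import Counter, defaultdict

def train_bigram(corpus: list[str]) -> dict[str, Counter]:
    """Count which word follows which across the training sentences."""
    follows: dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def generate(follows: dict[str, Counter], prompt: str, max_new_words: int = 5):
    """Yield one predicted word at a time, like a streamed reply."""
    words = prompt.lower().split()
    if not words:
        return
    for _ in range(max_new_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # nothing ever followed this word in the corpus
        next_word = candidates.most_common(1)[0][0]
        words.append(next_word)
        yield next_word

# Hypothetical two-sentence "training set".
model = train_bigram([
    "reset your password from the settings page",
    "open the settings page",
])

# Words arrive one by one, which is why the reply appears sequentially.
for word in generate(model, "please reset", max_new_words=4):
    print(word, end=" ", flush=True)
print()
```

Because each word is produced only after the previous one is known, the output naturally streams rather than arriving as a single finished message.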

In contrast, NLU chatbots extract intents and entities from a customer’s inquiry to provide a response. NLU chatbots are deterministic state machines, while generative assistants are probabilistic prediction machines, which has implications for how they should be used in the context of customer success teams. Generative assistants are more suitable for providing personalized and high-quality support with high deflection rates, while NLU chatbots may be better for classifying inquiries by skill, and routing them to the appropriate agents.
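
As a rough sketch of the NLU side, the snippet below extracts an intent and an entity from a message and routes it deterministically. The intents, trigger phrases, queue names, and six-digit order number format are all assumptions made up for illustration; production NLU systems use trained classifiers rather than substring matching.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical intents and utterances that a conversation designer maintains by hand.
INTENT_UTTERANCES = {
    "track_order": ["where is my order", "track my order", "order status"],
    "reset_password": ["reset my password", "forgot my password"],
}
ORDER_ID = re.compile(r"\b\d{6}\b")  # assumed format: six-digit order numbers

@dataclass
class NluResult:
    intent: Optional[str]
    entities: Dict[str, str] = field(default_factory=dict)

def parse(message: str) -> NluResult:
    """Extract an intent (if any utterance matches) and an order ID entity."""
    text = message.lower()
    intent = next(
        (name for name, phrases in INTENT_UTTERANCES.items()
         if any(phrase in text for phrase in phrases)),
        None,
    )
    entities = {}
    match = ORDER_ID.search(text)
    if match:
        entities["order_id"] = match.group()
    return NluResult(intent, entities)

def route(result: NluResult) -> str:
    """Known intents go to a skill queue; anything else goes to a human."""
    queues = {"track_order": "shipping", "reset_password": "account"}
    return queues.get(result.intent, "human_agent")

print(route(parse("Where is my order 123456?")))
print(route(parse("My parcel never showed up")))
```

The second message means roughly the same thing as the first, but because no predefined utterance matches, it falls through to a human agent.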

How generative assistants work, in a nutshell

We’ll analyze the pros and cons of generative assistants vs NLU chatbots on 5 dimensions:

  • Flexibility: their ability to handle the long tail of customer requests, including those that have never been seen before.
  • Utility: their ability to handle complex, multi-step customer requests that involve downstream actions like account updates.
  • Ease of use: how much training is required to achieve optimal performance.
  • Scaling: how much effort is needed to maintain and even improve the performance of the AI.
  • Safety: Whether the AI is truthful and benign, and how to monitor its responses.


Flexibility

NLU bots can only handle requests they have been explicitly trained on, and they defer any request they don’t understand to a support agent, resulting in slower response times and fewer self-serve resolutions.

Generative assistants draw on their knowledge of the company and product to handle a much wider range of inquiries, including requests they have never seen before.

Generative assistants are also highly configurable. Because they are probabilistic prediction machines, CX teams can set a confidence threshold below which the AI defers a request to a human agent. This lets teams experiment with different settings and find the optimal balance between deflection rates and customer satisfaction.
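
The threshold idea can be sketched in a few lines. This is a minimal illustration, assuming the assistant attaches a confidence score between 0 and 1 to each drafted reply; how that score is computed is not covered here, and the names `Draft`, `decide`, and `deflection_rate` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Draft:
    text: str
    confidence: float  # assumed: a score in [0, 1] supplied by the assistant

def decide(draft: Draft, threshold: float) -> str:
    """Send the draft if it clears the threshold, otherwise hand off to a human."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return "send_to_customer" if draft.confidence >= threshold else "escalate_to_agent"

def deflection_rate(drafts: List[Draft], threshold: float) -> float:
    """Share of conversations resolved without a human at a given threshold."""
    if not drafts:
        return 0.0
    sent = sum(decide(d, threshold) == "send_to_customer" for d in drafts)
    return sent / len(drafts)

# Made-up sample of drafted replies with their confidence scores.
DRAFTS = [
    Draft("You can reset your password in Settings.", 0.95),
    Draft("Your plan renews on the 1st of each month.", 0.80),
    Draft("Refunds usually take a few days.", 0.55),
    Draft("I think this might be a billing bug?", 0.30),
]

for threshold in (0.5, 0.9):
    print(threshold, deflection_rate(DRAFTS, threshold))
```

Raising the threshold lowers the deflection rate but sends fewer shaky answers to customers, which is the trade-off a CX team would tune.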

Ultimately, this flexibility allows generative assistants to provide a more efficient and effective customer service experience, while also freeing up support agents to focus on more complex issues.


Utility

Both types of conversational AI require actions to be defined in order to make changes to customer data or trigger downstream events. So while generative assistants can handle a wider range of inputs, they can only act on behalf of the customer through actions that someone has built.

However, one key difference is that generative assistants can predict a user’s intent without having utterances defined in advance. This means that conversational designers don’t need to brainstorm 100 different ways to say “I need help”; it works out of the box. If additional specificity is needed, the LLM that underpins a generative assistant can always be fine-tuned to recognize certain phrases.
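
One way to picture "actions must be defined" is a small registry that maps an intent name to a function. This is a hypothetical sketch: the `action` decorator, the `update_email` action, and the in-memory `ACCOUNTS` dictionary are stand-ins invented for the example, and the intent is assumed to have been predicted already (by an LLM or an NLU classifier).

```python
from typing import Callable, Dict

# Stand-in for a real customer database.
ACCOUNTS = {"c-1": {"email": "old@example.com"}}

# Registry of the actions the assistant is allowed to take.
ACTIONS: Dict[str, Callable[..., str]] = {}

def action(name: str):
    """Decorator that registers a function as a callable action."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        ACTIONS[name] = fn
        return fn
    return register

@action("update_email")
def update_email(customer_id: str, new_email: str) -> str:
    ACCOUNTS[customer_id]["email"] = new_email
    return f"Email updated to {new_email}."

def run_action(intent: str, **params) -> str:
    """Run the action for a predicted intent, or hand off if none exists."""
    handler = ACTIONS.get(intent)
    if handler is None:
        return "No action is defined for this request; handing off to an agent."
    return handler(**params)

print(run_action("update_email", customer_id="c-1", new_email="new@example.com"))
print(run_action("cancel_subscription"))
```

However the intent is recognized, the assistant can only change customer data through entries that exist in the registry.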

Ease of Use

NLU bots can be extremely effective, but they require training, which involves defining intents and entities ahead of time and continually tweaking them as products change. Failure to make these adjustments can result in a bot that is no longer effective at recognizing what customers are trying to accomplish. On the other hand, generative assistants work well right out of the box, without the need for training. Intents still need to be defined, though this can often be done automatically based on content companies already have. This does mean that generative assistants require a relatively large amount of content to perform well, but luckily, most companies have enough content in their help center to fulfill this requirement.


Scaling

With NLU bots, performance scales with effort. This means that every change made to a product requires new intents to be defined, new utterances to be considered, and new actions to be built out.

On the other hand, generative assistants scale with content. Each new support article, documentation page, or GitHub repo helps to make generative assistants better at understanding and handling customer issues. They can even go a step further and draft new articles and content based on customer conversations and internal documents like PRDs. This means that the more content available, the better generative assistants perform.


Safety

One of the challenges with large language models (LLMs) is that they tend to hallucinate, which means they can give answers confidently even when those answers are incorrect. An NLU bot, on the other hand, is just a classifier, so it can’t hallucinate, but it can still classify intent incorrectly, which can lead to frustration for customers.

Fortunately, new techniques are emerging that can help mitigate the impact of hallucinations by using LLMs to generate responses, but not to make decisions or handle business logic. This means that companies can still take advantage of the benefits of LLMs while minimizing the risks associated with hallucinations.
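
Here is a minimal sketch of that separation, assuming a hypothetical 30-day refund policy. The eligibility decision is made in ordinary code, and the LLM would only be asked to phrase the outcome; the model call itself is left out, and `Order`, `refund_allowed`, and `build_prompt` are names invented for this example.

```python
from dataclasses import dataclass
from datetime import date

REFUND_WINDOW_DAYS = 30  # hypothetical policy, not taken from any real product

@dataclass
class Order:
    order_id: str
    purchased_on: date

def refund_allowed(order: Order, today: date) -> bool:
    """Business rule evaluated in plain code, never by the model."""
    return (today - order.purchased_on).days <= REFUND_WINDOW_DAYS

def build_prompt(order: Order, today: date) -> str:
    """Give the LLM a decision that is already made; it only writes the message."""
    decision = "approved" if refund_allowed(order, today) else "declined"
    return (
        f"The refund for order {order.order_id} has been {decision}. "
        "Write a short, friendly message telling the customer this. "
        "Do not change the decision."
    )

order = Order("A100", date(2023, 1, 1))
print(build_prompt(order, today=date(2023, 1, 20)))
print(build_prompt(order, today=date(2023, 2, 22)))
```

If the model words the reply badly, the customer gets an awkward sentence, but the refund decision itself cannot be hallucinated.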

Final thoughts

One other factor that shouldn’t be overlooked is how intuitive it is for humans to understand how different types of AI chatbots work under the hood. This is where generative assistants have chatbots beat.

NLU chatbots use a state machine to keep track of their internal state. They predict the user’s intent, gather data from internal systems, and use predefined logic to determine how to modify their own state based on user input. The problem is that this approach does not reflect how humans interact with each other. In contrast, generative assistants operate much more like human agents! They start by predicting the user’s intent, then search for relevant documentation, and finally construct a response based on the content and the customer’s situation. If an assistant cannot find any useful information, it escalates the query to a human agent. It can also generate documentation based on how the agent responded, learning from the interaction to assist customers in the future.
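
The search, respond, and escalate steps can be sketched as a tiny pipeline. This is a simplified illustration under stated assumptions: keyword overlap stands in for real document search, a template stands in for the LLM-written reply, the intent-prediction step is skipped, and the help center articles are invented.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical help center content.
HELP_CENTER = {
    "Resetting your password": "Open Settings, choose Security, then click Reset password.",
    "Updating billing details": "Go to Billing and select Edit payment method.",
}
STOPWORDS = {"how", "do", "i", "my", "the", "a", "to", "can", "as", "and", "your", "then"}

def tokens(text: str) -> Set[str]:
    """Lowercase words with common filler words removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def search(question: str, min_overlap: int = 2) -> Optional[str]:
    """Return the title of the best-matching article, or None if nothing is close."""
    query = tokens(question)
    best_title, best_score = None, 0
    for title, body in HELP_CENTER.items():
        score = len(query & tokens(title + " " + body))
        if score > best_score:
            best_title, best_score = title, score
    return best_title if best_score >= min_overlap else None

def answer(question: str) -> Dict[str, Optional[str]]:
    """Answer from documentation when possible, otherwise escalate."""
    title = search(question)
    if title is None:
        return {"status": "escalated", "reply": None}
    # A real assistant would pass the article and conversation to an LLM here.
    reply = f"According to our article '{title}': {HELP_CENTER[title]}"
    return {"status": "answered", "reply": reply}

print(answer("How do I reset my password?"))
print(answer("Can I export my invoices as CSV?"))
```

The second question has no matching article, so it is escalated; a human agent's reply to it could later become a new article that the search step can find.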

If you’re looking for an approach to self-serve customer support that prioritizes flexibility, utility, and ease of use, and your company already has help content published, generative assistants might be a perfect way to scale up your CX org and do much more without scaling up your headcount.

If that sounds like something you’d like to try, check out Buff! We’ve created the first generative assistant used by CX teams in production.