To provide excellent customer service, artificially intelligent agents (or AI agents) need to be able to understand what customers actually want. Our human understanding of the broad variance in natural language is easy to take for granted, but for AI agents, extracting the meaning from sentences is a complex process that’s still being explored by the world’s best dialogue teams.
Most of the chatbots you’ve already seen on the market have been built on decision-tree-style conversational flows, where the conversation has been scripted in advance by the chatbot developer. Think about the number of different paths conversations can take; how can developers possibly account for them all? These chatbots don’t truly understand the semantics of the user’s query; their understanding is nothing more than deciding which step to take next along a pre-scripted set of paths.
In this post, we’ll take a high-level look at how AI agents process queries and derive the intents behind them.
Understanding what customers want
Have a listen to this demo of a customer speaking to an AI agent we made for a car-rental use case.
Rather than saying ‘please can I speak to your repairs department?’, this customer does what most people would do in this situation and explains the unique circumstances that have led her to make this call. Good AI agents need to be able to understand what customers are saying, regardless of the wording they use to express themselves.
We use a process called intent detection, in which the AI agent processes speech and matches each query to one of a set of predefined intents. We’ve trained our model on billions of general conversations, so it has a strong baseline for understanding the intent behind a vast number of sentences. We fine-tune the model with small amounts of domain-specific data to suit each AI agent and the intents it handles.
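To make the idea of matching a query to an intent more concrete, here is a minimal sketch. It is not our model: the `embed` function is a toy bag-of-words stand-in for a trained sentence encoder, and the intent names and example utterances are made up for a car-rental setting. The shape of the approach is the point: encode the query, compare it with examples of each intent, and pick the closest match along with a score.

```python
import math
import re
from collections import Counter

# Hypothetical example utterances per intent; a real agent would learn from far more data.
INTENT_EXAMPLES = {
    "report_breakdown": [
        "my car has broken down",
        "the car won't start and I am stuck",
    ],
    "extend_rental": [
        "I want to keep the car for longer",
        "can I extend my rental by a day",
    ],
}


def embed(text):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_intent(query, examples=INTENT_EXAMPLES):
    """Return (intent, score) for the example utterance most similar to the query."""
    query_vec = embed(query)
    best_intent, best_score = None, 0.0
    for intent, utterances in examples.items():
        for utterance in utterances:
            score = cosine(query_vec, embed(utterance))
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent, best_score


intent, score = detect_intent("my car has broken down on the motorway")
print(intent, round(score, 2))
```

A word-count comparison like this breaks down as soon as the customer uses wording that shares no vocabulary with the examples, which is exactly why a model pre-trained on large amounts of conversation is needed in practice.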
Getting the right information
Once the AI agent has identified the intent behind a query, it may need to extract certain information that will allow it to complete the task. The AI agent will have a predefined set of information to obtain, and will need to guide the flow of conversation to ensure it gets everything it needs.
Value extraction is the process by which AI agents extract the relevant information from customer queries and store it against the relevant ‘slots’. While it sounds straightforward, it’s really difficult to get AI agents to understand which information goes in which slot.
For example, you might say, ‘I want to book a table for 2 at 4’. For an AI agent to understand that one number is a time and the other is the number of people, it must have been trained on data that demonstrates that ‘for 2’ means for 2 people and ‘at 4’ means at 4 o’clock, and that if the restaurant is open from 12pm to 12am, then ‘4’ is probably 4pm.
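The reasoning in that example can be written out by hand for this one sentence pattern. The sketch below is an illustration only: it uses two hard-coded regular expressions where a real agent relies on a trained model, and the function name, slot names and default opening hours (12 to 24 on a 24-hour clock) are assumptions made for the example.

```python
import re


def extract_booking_slots(utterance, open_hour=12, close_hour=24):
    """Fill 'party_size' and 'hour' (24-hour clock) slots using simple context cues."""
    slots = {}

    # 'for <number>' is read as the number of people.
    people = re.search(r"\bfor (\d+)\b", utterance)
    if people:
        slots["party_size"] = int(people.group(1))

    # 'at <number>' is read as a time of day.
    time = re.search(r"\bat (\d{1,2})\b", utterance)
    if time:
        hour = int(time.group(1))
        # Consider both the am and pm readings and keep the one inside opening hours.
        candidates = [hour % 12, hour % 12 + 12]
        in_hours = [h for h in candidates if open_hour <= h < close_hour]
        if in_hours:
            slots["hour"] = in_hours[0]

    return slots


print(extract_booking_slots("I want to book a table for 2 at 4"))
# {'party_size': 2, 'hour': 16}
```

Rules like these stop working as soon as someone says ‘four of us, around half four’, which is why the slot assignment has to be learned from data.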
This is a problem we’ve been able to solve to get our restaurant booking platform up and running with real customers. If you want to learn more about why it’s so difficult for AI agents to extract values, check out our blog post on the neural language understanding of people’s names.
At PolyAI, we build AI agents with great memory as default. This means they can take down information as it’s given, rather than asking end users to repeat themselves when the machine decides it’s time to record something.
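One way to picture this is a small piece of conversation state that records any slot value as soon as it turns up, then asks only for what is still missing. This is a simplified sketch, not our implementation; the class name, method names and the list of required slots are hypothetical.

```python
# Hypothetical set of information a restaurant booking needs.
REQUIRED_SLOTS = ["party_size", "hour", "name"]


class BookingState:
    """Tracks which pieces of booking information have been collected so far."""

    def __init__(self, required=REQUIRED_SLOTS):
        self.required = list(required)
        self.filled = {}

    def update(self, extracted):
        """Store any slot values found in the latest turn, even if not yet asked for."""
        for slot, value in extracted.items():
            if slot in self.required:
                self.filled[slot] = value

    def next_missing(self):
        """Return the next slot to ask about, or None when everything is collected."""
        for slot in self.required:
            if slot not in self.filled:
                return slot
        return None


state = BookingState()
# The customer volunteers a time and a name before being asked for either.
state.update({"hour": 16, "name": "Sam"})
print(state.next_missing())  # party_size
```

Because the time and name are already stored, the agent only needs to ask how many people are coming.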
What if the machine isn’t sure?
Where a human agent will naturally be aware of how comfortable they feel understanding a query, an AI agent needs to calculate its confidence at every step of the way.
Typically, the AI agent will have a confidence threshold for each intent, which determines whether it moves forward with the query, asks for further clarification or hands off to a human agent.
Some intents are less weighty than others, so we might program the AI agent to move forward at a relatively low confidence threshold. For example, if someone is asking to reserve a high chair in a restaurant that we know has an abundance of high chairs, we may allow the AI agent to move forward at a lower confidence threshold than we would for someone giving more sensitive information like card details.
Based on the confidence levels, the AI agent has to be smart enough to understand when to proceed with the conversation and when to hand it off to a human agent. In other words, the AI agent must be able to understand that it won’t be able to understand, which is a complex problem in its own right!
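A per-intent threshold policy of this kind could look something like the sketch below. The intent names, threshold values and the size of the ‘clarify’ band are all invented for illustration; in a real deployment they would be tuned for each client and use case.

```python
# Hypothetical thresholds: the more sensitive the intent, the more confidence required.
PROCEED_THRESHOLDS = {
    "reserve_high_chair": 0.5,
    "take_card_details": 0.9,
}
DEFAULT_THRESHOLD = 0.8  # used for intents without a specific threshold
CLARIFY_MARGIN = 0.2     # how far below the threshold we still try a clarifying question


def next_action(intent, confidence):
    """Choose between proceeding, asking for clarification and handing off to a human."""
    threshold = PROCEED_THRESHOLDS.get(intent, DEFAULT_THRESHOLD)
    if confidence >= threshold:
        return "proceed"
    if confidence >= threshold - CLARIFY_MARGIN:
        return "clarify"
    return "handoff"


print(next_action("reserve_high_chair", 0.6))  # proceed
print(next_action("take_card_details", 0.6))   # handoff
```

The same confidence score of 0.6 leads to different behaviour depending on how much is at stake for that intent.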
Bringing it all together
At PolyAI, we build AI agents on top of our Encoder model. The model is trained on hundreds of millions of conversations and can understand a broad variety of conversational contexts. It handles typos and figures of speech, and it works across a large number of languages.
We use our model and clients’ knowledge bases to develop intent classifiers that are able to extract niche, domain-specific intents from complicated queries. We then use value extraction to fill the slots required to resolve the query. Confidence measurement means our clients can constantly monitor the accuracy of their AI agents, and make educated decisions about when conversations should be handed off to human agents.
Intent alone is not enough
Understanding what the other person is saying is only one half of the conversation. Next time we’ll look at how AI agents craft responses and trigger actions to solve customer queries.
If you’d like to learn more about automating customer service with conversational AI, get in touch with PolyAI today.