Why DIY conversational AI won’t achieve performance at scale

We come across a lot of companies that are experimenting with DIY conversational voice assistant capabilities offered by the likes of Google Dialogflow, Amazon Lex or IBM Watson, or by open-source frameworks like Rasa. But when we get further into discussions, it turns out that most of these companies have not put their voice assistant into action with real customers. Their proof-of-concept remains under lock and key months on, handling only pre-scripted queries from testers selected from within the project team.

In our experience, an initial proof-of-concept that handles 3 to 5 broad intents should be ready to face real customers within 2 weeks. Once it reaches a threshold level of intent accuracy and customer satisfaction, it should be ready to expand to additional intents within 3 months.
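As a rough illustration, a threshold check of this kind could be sketched in Python as below. The record fields, the 1 to 5 satisfaction scale and the default thresholds of 0.9 and 4.0 are all hypothetical assumptions chosen for the example, not figures from any real deployment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CallOutcome:
    """One reviewed call. All fields are hypothetical, for illustration only."""
    expected_intent: str   # intent a human reviewer assigned to the call
    predicted_intent: str  # intent the voice assistant recognised
    csat: int              # customer satisfaction score, assumed to be 1 to 5


def intent_accuracy(calls: List[CallOutcome]) -> float:
    """Fraction of calls where the predicted intent matches the reviewer's label."""
    if not calls:
        return 0.0
    correct = sum(1 for c in calls if c.expected_intent == c.predicted_intent)
    return correct / len(calls)


def mean_csat(calls: List[CallOutcome]) -> float:
    """Average customer satisfaction score across the reviewed calls."""
    if not calls:
        return 0.0
    return sum(c.csat for c in calls) / len(calls)


def ready_to_expand(calls: List[CallOutcome],
                    min_accuracy: float = 0.9,   # assumed threshold
                    min_csat: float = 4.0) -> bool:  # assumed threshold
    """True only when both accuracy and satisfaction meet their thresholds."""
    return intent_accuracy(calls) >= min_accuracy and mean_csat(calls) >= min_csat


# Small made-up sample of reviewed calls.
sample_calls = [
    CallOutcome("book_table", "book_table", 5),
    CallOutcome("cancel_booking", "cancel_booking", 4),
    CallOutcome("opening_hours", "book_table", 3),
    CallOutcome("book_table", "book_table", 5),
]
```

In practice the thresholds would be agreed per deployment, and the reviewed calls would come from real customer conversations rather than scripted test queries.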

The risks with DIY solutions

A number of predictable questions about risk can hold up the initial design process:

What happens if the voice assistant misunderstands a customer?
What if a customer raises their voice?
Is it ok for the voice assistant to cut off a customer mid-sentence?
What should be the warmth and tone of the voice assistant?

If a voice assistant for customer service is built on platforms such as Google Dialogflow or Amazon Lex, those risks are more significant because the technology is general purpose. These platforms are built to support text chatbots, smartphones and smart speakers across a broad range of intents, which is unsurprising given their parent companies. Existing DIY conversational platforms have not been optimised for the specific Automatic Speech Recognition (ASR) challenges of phone support, like line static, accents or background noise. Nor have their Natural Language Understanding (NLU) models been optimised for the nature of customer service conversations: longer explanations, digressions to other topics, interruptions and specific lexicons. As a result, achieving performance at scale has not been as simple as turning on a ‘voice’ feature.

Enterprises have limited options for managing and controlling these risks during live deployment. Product teams can turn features on and off, or tweak training phrases here and there, but these changes are little more than stabs in the dark when the team cannot troubleshoot with the software engineers in charge of the underlying technology. When real customer calls are involved and the stakes are high, this can understandably bring any plans for a live deployment to a halt.

Truly scalable conversational AI requires an expert understanding of speech recognition, NLU and dialogue science

At PolyAI, we dedicate teams to every client to build voice assistants that are crafted to understand the conversations of specific customer journeys. Our experience has shown that human-level performance comes from close collaboration across all layers of the conversational AI stack, from ASR through to dialogue management. That’s why our proprietary conversational platform was built to give us complete control to augment ASR and reduce transcription errors. It also allows us to fine-tune our NLU model to increase accuracy in critical moments – those habitual pauses, mumbles and clarifications – to make a conversation flow. We optimise dialogue management to improve the understanding of context in each conversation, making it straightforward for customers to get what they need and drastically reducing the time needed to make changes later on. During live deployments, our dialogue scientists analyse customer calls to proactively optimise performance, rolling out improvements in a matter of hours or days. These capabilities make all the difference between a narrow proof-of-concept and a live deployment that can deliver real customer and business value.

It’s expensive (and difficult!) to build any, let alone all, of these capabilities in-house, and for most companies it does not make sense. However, that should not stop companies from launching new customer experiences with conversational AI. We believe that managed services like PolyAI, with access to world-class engineering and research talent, offer the most robust path to value for voice assistants in customer service. The caveat: look for vendors that have designed their conversational platform specifically for customer service, with a proven track record of outperforming existing general-purpose conversational AI (see our research and results here, or read more here).
