Handling languages and accents

PolyAI voice assistants are able to accurately take down valuable information in any language and understand accents easily.

Michael Chen VP of Strategic Alliances and Corporate Development

Jan 7, 2021

6 min

Overview

What are multilingual voicebots?

Let’s face it. Understanding your customers isn’t always easy. They tell stories, go off track, and often can’t think of a good way to explain an issue to you. And those are just the customers who speak the same language as you.

Even for humans, it can be really difficult to understand people with strong or unfamiliar accents. Call center workers receive accent training to understand 35 different variations of English alone.

And what about communicating in different languages? For multinational organizations, speaking to customers in different languages means going the extra mile to hire multilingual staff, and making sure that calls actually go through to the agent who can speak the caller’s language.

Multilingual voicebots are advanced conversational AI systems designed to handle customer interactions in multiple languages. They help businesses enhance engagement, improve customer satisfaction, and expand their reach by offering seamless communication across diverse markets in real time.

These sophisticated bots leverage natural language processing (NLP) and machine learning to comprehend, interpret, and respond to spoken or written customer queries across different languages. Unlike traditional single-language bots, AI-powered multilingual voicebots can seamlessly transition between languages within the same conversation, delivering a more accessible and inclusive user experience.

They are essential for businesses that operate in global markets or serve a diverse customer base, as they enable companies to deliver consistent, high-quality support regardless of the customer’s preferred language.

The language and accent challenges for voicebots

Organizations in English-speaking countries have an inherent advantage. English was very likely the first language used on the internet, so more English training data is available, and more training data leads to more accurate predictions by software. As a result, machines ‘understand’ English better than other languages.

Speech recognition has advanced by leaps and bounds, but it’s not perfect. The exact same conversation with a New Yorker, an Australian, and a Serbian speaker of US English will each produce a slightly different text transcript, which is what trips up most voice assistants today. This leads to the much-dreaded “sorry, can you repeat?” loop, even for native English speakers.

For organizations in non-English-speaking countries or those that serve multicultural customers, the challenge is two-fold. First, speech recognition solutions are not as accurate in other languages, which immediately makes things more difficult. Second, having less training data in a particular language limits the accuracy of conversational models in that language. This is why it is much harder to build great conversational experiences for customers in, say, Italian, Latvian, or Singlish than it is in English.

But it’s not hopeless. From day one, PolyAI has focused on creating voice self-service experiences in any language, regardless of accent or slang. Our goal is to deliver enterprise-ready voice assistants that match and exceed those from household names, and our products achieve just that. We are purpose-built for multilingual voice applications, which means we do a few things differently...

Technology

Speech recognition optimization

Some speech recognition solutions are better at understanding particular languages and accents than others. For example, the best solution for Polish may be different from the best solution for Thai.

Flexibility here is key as speech recognition solutions continue to improve, which is why we test different speech recognition providers to find the best one for each project. On top of this, we use machine learning to add a further layer of optimization at each stage of a conversation. This helps produce the most accurate transcriptions regardless of call quality or accent.

Think of this as the generative AI equivalent of accent training, applied phrase by phrase.
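To make the provider comparison described above more concrete, here is a minimal sketch of how one might rank speech recognition providers on a set of test calls using word error rate (WER), a standard transcription metric. It is an illustration only, not PolyAI’s actual evaluation pipeline: the function names, the provider names, and the sample transcripts are all hypothetical.

```python
from typing import Dict, List, Tuple


def word_edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two transcripts, counted in words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(references: List[str], hypotheses: List[str]) -> float:
    """Total word edits divided by total reference words."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must be aligned")
    total_words = sum(len(r.split()) for r in references)
    if total_words == 0:
        raise ValueError("references must contain at least one word")
    total_edits = sum(word_edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return total_edits / total_words


def pick_best_provider(
    references: List[str],
    provider_outputs: Dict[str, List[str]],
) -> Tuple[str, Dict[str, float]]:
    """Return the provider with the lowest WER, plus all scores."""
    scores = {name: corpus_wer(references, hyps)
              for name, hyps in provider_outputs.items()}
    return min(scores, key=scores.get), scores


# Hypothetical human-checked transcripts and provider outputs.
references = ["book a table for two", "my name is anna"]
provider_outputs = {
    "provider_a": ["book a table for two", "my name is hannah"],
    "provider_b": ["book table for two", "my name is ana"],
}
best, scores = pick_best_provider(references, provider_outputs)
print(best, scores)
```

In practice the test set would be built from real calls in the target language and accents, since a provider that scores well on one language may not score well on another.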

Pre-training

Pre-trained speech encoder in multiple languages

Our state-of-the-art machine learning model is pre-trained on over a billion English conversations, which gives us a world-class foundation for natural language understanding.

We’re constantly building upon this foundation by pre-training our model in over 15 other European and non-European languages. Instead of building a new model for each language, we incorporate new languages on demand in a matter of weeks. This enables us to achieve the fastest time to market for voice assistants that are truly capable of conversing with real customers in their native languages and offering elite multilingual support.

Value Extraction

A multilingual value extractor

Our proprietary value extractor, ConVEx, is also trained in multiple languages.

This means that PolyAI voice assistants are able to accurately take down valuable information, such as names and addresses, in any language.

Our research has shown that our multilingual approach outperforms monolingual models. For example, after teaching our model German on top of its English foundation, our voice assistants are better at identifying information given by callers than a model trained only on English or a model trained only on German. This mirrors similar findings by other leading tech companies like Facebook.

All this is to say, our voice assistants are better at collecting information from callers in different languages, making them uniquely suited for self-service experiences.
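Claims like “better at identifying information” are typically backed by slot-level precision, recall, and F1. The sketch below shows one way such a score could be computed for a single call. It is not ConVEx and does not reproduce PolyAI’s evaluation setup: the function name, the slot names, and the example values are hypothetical and chosen only to illustrate the metric.

```python
from typing import Dict


def slot_scores(gold: Dict[str, str], predicted: Dict[str, str]) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 over (slot, value) pairs.

    Values are compared case-insensitively after trimming whitespace.
    """
    gold_pairs = {(slot, value.strip().lower()) for slot, value in gold.items()}
    pred_pairs = {(slot, value.strip().lower()) for slot, value in predicted.items()}
    true_positives = len(gold_pairs & pred_pairs)
    precision = true_positives / len(pred_pairs) if pred_pairs else 0.0
    recall = true_positives / len(gold_pairs) if gold_pairs else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example: what the caller actually said vs. what was extracted.
gold = {"name": "Anna Schmidt", "city": "Berlin"}
predicted = {"name": "anna schmidt", "city": "Bern"}
result = slot_scores(gold, predicted)
print(result)
```

Running the same scoring over a held-out set of calls for each candidate model (for example, monolingual versus multilingual) gives a like-for-like comparison of extraction quality.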

Future

The future of multilingual customer service

Whether organizations are trying to boost self-service for English-speaking or non-English-speaking customers, we see proofs of concept falling flat with real customers due to common issues that all relate back to these three elements:

The ability for speech recognition to deal with accents and imperfect signals;
Robustness of speech encoders in each language; and
Accuracy of value extraction when dealing with natural and informal ways of speaking.

These barriers are not insurmountable. Voice technology is available today to build multilingual self-service experiences, but it does require an extra level of craftsmanship. Look for companies like PolyAI that have both the proprietary technology and the research expertise to help you create new self-service experiences for your customers in any language.

FAQs

Multilingual voicebot FAQs

What is a multilingual voicebot?

A multilingual voicebot is an automated software application that understands and responds to spoken language in multiple languages. These voicebots leverage natural language processing (NLP) and speech recognition technologies to interact with users, provide services, answer questions, and perform tasks in various languages.

What are the benefits of multilingual conversational AI?

Multilingual AI voicebots offer several significant benefits, particularly in enhancing user experience and expanding reach. Here are some key advantages:

Enhanced customer experience
Customer support
Global reach
Market expansion
Reduced need for human agents
Scalability
Operational efficiency
Data collection and insights
Competitive advantage
Compliance and localization

What are the challenges of implementing a multilingual voicebot?

Implementing a multilingual voicebot for a contact center presents several challenges that must be addressed to ensure its effectiveness and reliability. Here are some of the key challenges:

Language barriers and variability
Natural language understanding (NLU)
Speech synthesis quality
Data requirements
Integration and maintenance
Cultural sensitivity
Performance and scalability
User experience design
Security and privacy

To address these challenges, you’ll need advanced technology, extensive linguistic and cultural expertise, and continuous refinement based on user feedback and linguistic trends.
