
Five tips for writing an RFP for conversational AI for voice

April 27, 2023



With thousands of conversational AI vendors to consider, finding the right solution to your CX problems is understandably challenging.

While we believe there is a more collaborative and effective approach to finding a specialist voice partner, many organizations are required to carry out the Request for Proposal (RFP) process.

We’ve seen dozens of RFPs for conversational AI from companies that want to deliver natural-sounding voice assistants, but these RFPs often fail to take into account the differences between voice and text-based conversational AI solutions. A smart approach to RFP design will help you find the best solution for your business.

While many vendors offer voice solutions, they often lack the specialist voice capabilities required to deploy experiences that your customers will engage with.

In this post, we’ll explore five things you should consider when creating your RFP for an AI voice assistant, to ensure that you find the perfect balance between efficiency and experience.

1. Understand the challenges of the voice channel

How a person speaks differs fundamentally from how they type or text and requires a specialized set of technologies and capabilities.

Consider the following:

  • Accents – we all sound the same when we type, but when we speak, we have different accents, styles, and cadences of speech.
  • Background noise/poor connections – voices are muffled over the phone, and customers often prefer the voice channel over text in noisy environments, such as when driving or outdoors.
  • Synonyms, idioms, slang – we tend to speak more colloquially than we type, using everyday turns of phrase that may not be expected.
  • Storytelling – we tend to be more concise when typing, but in speech, we tell long stories with lots of detail, most of which is unnecessary to complete a task.

What’s more, there’s no graphical user interface for speech, so the voice assistant needs to be able to engage callers and guide them through finely crafted, concise spoken utterances. Where chatbots can use buttons and menu options to guide users, voice assistants need to start conversations with open questions to engage callers. Customer service representatives don’t start conversations with, “If you’d like to talk about billing, say billing.” They say, “How can I help?” Voice assistants should do the same.

Giving your customers the freedom to speak naturally, and the confidence that they will be understood, requires a solution that is designed and built specifically for conversational voice interactions.

PolyAI has developed several NLU models trained on billions of real-world conversations. Our customer-led voice assistants can understand callers whatever they say, and however they say it, enabling even the most complex questions to be successfully contained.

Some questions to consider in your RFP

  • Does the company have demonstrable success deploying voice assistants for customer service?
    – Ask to see case studies, hear call recordings, and ideally, an opportunity to call in and test yourself.
  • Does the implementation team include voice-specific design expertise, i.e., experience designing for spoken language?
    – Ask them to present their voice design and implementation methodology, and check that it includes voice-specific guiding principles (e.g., brevity and tonality).

2. Understand the voice tech stack

Poor speech recognition is behind countless bad customer experiences over the phone. How often have you heard “Sorry, I didn’t get that” from a conversational IVR?

Telephony-bound voice is inherently challenging from a speech recognition perspective. Voice solutions must account for background noise, poor quality, and various dialects and accents. All of these factors can make understanding a caller’s words difficult.

For a voice assistant to engage in customer-led conversation, it must be successful at listening, understanding, and responding to what the customer is saying.

Understanding how a vendor’s tech stack works is essential to creating an effective voice solution. Some critical technical capabilities to consider include the following:

Understanding what the caller is saying

Automatic speech recognition (ASR) – Does the vendor offer additional ASR tuning mechanisms and Spoken Language Understanding (SLU), giving you the capacity to understand callers over noisy phone connections?

Natural language understanding (NLU) – Do they use large language models (LLMs) developed for the specific requirements of customer service use cases?

Identifying callers

Alphanumeric capture – Does the vendor have a proven track record of accurately taking down alphanumeric strings (e.g., order numbers, ZIP codes, phone numbers) without requiring the caller to use the phone’s keypad?

Name recognition – Does the vendor have a solution for taking down callers’ names?
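To make the alphanumeric capture question more concrete, here is a minimal sketch of one small part of the problem: turning a transcribed utterance into a compact code. It is an illustration under stated assumptions, not a description of how any vendor (including PolyAI) does it. The function name `normalize_spoken_code` and the lookup tables are hypothetical, and the sketch assumes the speech recognizer has already produced accurate text, which is the harder part over a noisy phone line.

```python
import re

# Hypothetical lookup tables for illustration; a production system would
# need far broader coverage (accents, mishearings, other languages).
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}
MULTIPLIERS = {"double": 2, "triple": 3}


def normalize_spoken_code(transcript: str) -> str:
    """Convert a transcribed spoken code into a compact alphanumeric string."""
    tokens = re.findall(r"[a-z0-9]+", transcript.lower())
    parts = []
    repeat = 1
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        # Skip spelling aids such as "as in bravo" that follow a letter.
        if tok == "as" and i + 2 < len(tokens) and tokens[i + 1] == "in":
            i += 3
            continue
        if tok in MULTIPLIERS:
            # "double seven" means the next character is repeated.
            repeat = MULTIPLIERS[tok]
        elif tok in DIGIT_WORDS:
            parts.append(DIGIT_WORDS[tok] * repeat)
            repeat = 1
        elif len(tok) == 1:
            # A single spoken letter or digit.
            parts.append(tok.upper() * repeat)
            repeat = 1
        # Any other word ("my", "order", "number") is treated as filler.
        i += 1
    return "".join(parts)


print(normalize_spoken_code("my order number is B as in bravo, double seven, oh, four"))
```

Even this toy version has to handle spelling aids, "double seven", and "oh" for zero, which is why it is worth asking vendors for evidence of accuracy on real calls rather than a feature checkbox.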

Extracting values

Entity extraction – How does the vendor extract values from long utterances and understand values even if they are given in a non-traditional format (for example, “my wife and I” = 2 people)?
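As a rough illustration of why this is harder than it looks, the sketch below pulls a party size out of a free-form utterance using simple rules. Everything in it is an assumption made for the example: the function name `extract_party_size`, the word lists, and the rule-based approach itself. Real systems typically rely on trained models rather than hand-written rules, and this sketch will misread many phrasings.

```python
import re
from typing import Optional

# Hypothetical word lists for illustration only.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
SPEAKER_WORDS = {"i", "me", "myself"}
COMPANION_WORDS = {
    "wife", "husband", "partner", "friend", "friends",
    "son", "daughter", "kid", "kids", "colleague", "colleagues",
}


def to_number(token: str) -> Optional[int]:
    """Return the numeric value of a digit string or number word, if any."""
    if token.isdigit():
        return int(token)
    return NUMBER_WORDS.get(token)


def extract_party_size(utterance: str) -> Optional[int]:
    """Estimate how many people a caller is booking for."""
    tokens = re.findall(r"[a-z]+|\d+", utterance.lower())

    # Count companions, honoring a quantity such as "my two kids".
    companions = 0
    for i, tok in enumerate(tokens):
        if tok in COMPANION_WORDS:
            quantity = to_number(tokens[i - 1]) if i > 0 else None
            companions += quantity if quantity is not None else 1

    if companions:
        # Add the caller if they mention themselves ("my wife and I").
        speaker = 1 if any(t in SPEAKER_WORDS for t in tokens) else 0
        return companions + speaker

    # Otherwise fall back to the first explicit number ("a table for four").
    for tok in tokens:
        value = to_number(tok)
        if value is not None:
            return value
    return None


print(extract_party_size("It's for my wife and I, somewhere quiet if possible"))
```

Note how quickly the rules pile up for a single value. When evaluating vendors, ask to hear how their system copes with long, story-like answers rather than short, clean ones.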

Non-linear conversations

Dialogue policy – Can the solution facilitate customer-led conversations by allowing callers to go off-topic with unrelated questions and bring the conversation back to resolve the original query?

Consistent brand voice

Natural-sounding voice – Does the voice assistant have a natural-sounding voice?

Dialogue design – Are there sufficient design support capabilities to ensure scripting and voice direction have maximum impact over the phone?

Some questions to consider in your RFP

  • How does the vendor solve for speech recognition over the phone?
  • How does the vendor account for accuracy of information collection given noisy phone connections?
  • How does the vendor ensure a natural-sounding and on-brand voice that engages callers?
  • Which languages does the vendor currently support?

3. Know your project team

Often, solutions that seem low-cost and low-effort at first end up becoming overly complex in production. Additional products and solutions may need to be added, incurring hidden costs. As the build becomes more complex, deployment is pushed further and further back and, in some cases, never actually happens.

When considering a voice partner, you should understand up-front what will be required from you, your team and your company to ensure a successful deployment.


Consider how much time and input is required of your team and what additional technologies need to be built. Some vendors require training data, such as call transcripts and FAQs, to be gathered manually, which can cause an implementation to take months, if not years.

Using a low-code or no-code solution may seem attractive at first, but it will still require an in-house development and design team or a consultancy partner to build the assistant. If you go this route, make sure you understand exactly what will need to happen to deliver great customer experiences.


A successful deployment requires collaboration, and a team focused on making tweaks to improve performance. Ensure that your chosen partner will be closely monitoring the system in its first weeks, ready to make changes to enhance performance.


Some vendors provide the framework for business users to build voice assistants themselves. If you are thinking of launching a voice assistant this way, be clear on your ongoing partnership and services. If this isn’t outlined in your RFP, it can lead to delays and unforeseen costs for additional functionality and services.

Many voice solutions create a significant amount of data that, when structured correctly, provides invaluable customer information, such as why your customers are calling. Be clear about how you will access this data post-launch and turn it into a usable format.
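To show what "a usable format" might mean in practice, here is a minimal sketch that summarizes why customers are calling from structured call records. The record fields (`call_id`, `intent`, `resolved`), the sample data, and the function name `summarize_call_reasons` are all hypothetical; the export format you actually receive will depend on the vendor, so it is worth asking for a sample in the RFP.

```python
from collections import Counter

# Hypothetical structured call records, as a vendor export might provide them.
calls = [
    {"call_id": "c1", "intent": "billing_question", "resolved": True},
    {"call_id": "c2", "intent": "billing_question", "resolved": False},
    {"call_id": "c3", "intent": "change_booking", "resolved": True},
    {"call_id": "c4", "intent": "billing_question", "resolved": True},
    {"call_id": "c5", "intent": "change_booking", "resolved": True},
    {"call_id": "c6", "intent": "opening_hours", "resolved": False},
]


def summarize_call_reasons(records):
    """Count calls per intent and compute the share resolved by the assistant."""
    totals = Counter(r["intent"] for r in records)
    resolved = Counter(r["intent"] for r in records if r["resolved"])
    return [
        {
            "intent": intent,
            "calls": count,
            "resolution_rate": resolved[intent] / count,
        }
        for intent, count in totals.most_common()  # most frequent reasons first
    ]


for row in summarize_call_reasons(calls):
    print(f"{row['intent']}: {row['calls']} calls, {row['resolution_rate']:.0%} resolved")
```

If the data arrives only as raw transcripts or audio, someone on your team will have to build the step that produces records like these, which is exactly the kind of hidden effort to surface before you sign.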

Some questions to consider in your RFP

  • What training data is required and how does the vendor expect it to be delivered?
  • How does the vendor support you through the launch process?
  • What does the process of ongoing maintenance look like?

4. Know your strengths

Designing, launching, and maintaining a voice assistant requires a team of specialists, including dialogue designers, machine learning engineers, implementation engineers, voice user interface designers, and project managers, among many others.

Your technical team’s capabilities will significantly impact your approach to launching a voice assistant and the vendor you choose to work with. It can prove costly and inefficient to attempt to build a voice assistant with a team that hasn’t been through the process before. Understanding the strengths of your internal teams will help you identify the features and services you will need from a specialist voice partner.

Some questions to consider in your RFP

  • What expertise and skills are expected from your side?
  • How much time will you need to give throughout the process?

5. Be mindful of your business goals

With multiple stakeholders involved in the RFP process, it is common for business goals and objectives to be clouded by conflicting priorities and opinions about what is expected from a specialist voice partner.

Keep asking what you want to achieve throughout the RFP process. It will help you ensure that your chosen vendor provides the functionality and scalability your business needs.

For example, if your goal is to improve customer insights, you will need to prioritize vendors that offer real-time structured conversational data. Or, if you’re scaling your customer service operations internationally, you may need to consider multilingual capabilities.

We often see RFPs that ask specific questions about the capabilities the buyer believes will help achieve certain goals. But questions that focus on how a vendor will solve your problems can open up new and unforeseen opportunities that will empower you to deliver something amazing.

Some questions to consider in your RFP

  • How will the vendor help you to solve [insert specific business goal]?
  • How has the vendor solved [specific business goal] for other companies?


Finding a specialist voice partner using the RFP process can be time-consuming and a little daunting. However, by understanding the technology involved and the challenges the voice channel presents, you can ask the right questions to find a vendor to help you achieve your business goals.

