Why do LLMs ramble on and on?
About the show
Hosted by Nikola Mrkšić, Co-founder and CEO of PolyAI, the Deep Learning with PolyAI podcast is the window into AI for CX leaders. We cut through hype in customer experience, support, and contact center AI — helping decision-makers understand what really matters.
Summary
Large language models are powerful tools — but why do they ramble on and on instead of just getting to the point?
In this episode of Deep Learning with PolyAI, Nikola Mrkšić sits down with Oliver Shoulson, one of PolyAI’s original dialogue designers, to unpack why LLMs talk the way they do, and what that means for building natural-sounding voice AI.
They dive into:
- Why LLMs are trained to over-explain, and how that breaks spoken dialogue
- The subtle design choices that can make conversations feel human — without trying to fool anyone
- The debate over anthropomorphism: should AI agents admit they’re AI, or lean into human-like traits?
- How in-house LLMs can be fine-tuned to sound less like overeager interns and more like trustworthy teammates
Voice AI doesn’t need to trick us into thinking it’s human, but it does need to feel usable. This episode explores the hidden details of dialogue design that decide whether AI earns customers’ trust or just frustrates them with endless rambling.
Learn more about how to avoid LLM slop in Oliver's blog post: https://poly.ai/blog/how-not-to-talk-like-an-llm/
Key Takeaways
- Transparency vs. trust: Early research showed that introducing an agent as AI can trigger “representative, representative” responses, but with the right design, call outcomes and satisfaction are just as strong — sometimes better — when the system is upfront.
- LLMs talk too much: Large language models tend toward verbose, over-explanatory answers. That works for text, but in voice-based, task-oriented dialogues, it breaks the natural flow, making interruptions necessary.
- Why PolyAI builds its own LLMs: In-house models are fine-tuned to avoid unhelpful text habits, adapt to conversational norms (like politeness and brevity), and handle linguistic nuance across languages and dialects — something off-the-shelf LLMs don’t optimize for.
- Design for usability, not mimicry: The goal isn’t to trick people into thinking the agent is human, but to design usable, cooperative, and trustworthy systems. PolyAI’s dialogue design ethos: help users forget they’re speaking to a machine without pretending it’s a person.
Transcript
Nikola Mrkšić
00:11 – 00:47
Hello, everyone, and welcome to another episode of Deep Learning with PolyAI. Today, I’ve got Oliver Shoulson, who is one of our original dialogue designers and a guy that I’ve had many amazing conversations with over the years, and he’s taught me a lot.
And we’ll talk about some of the things that we had kinda, like, worked on, but we’re here to talk about dialogue design, anthropomorphism, whether or not it should say it’s human, and what putting an LLM as this kind of, like, main constituent part of its brain does to the flow of the conversation. So, Oliver, welcome to the podcast.
Oliver Shoulson
00:47 – 00:49
Thank you so much.
Nikola Mrkšić
00:49 – 01:18
It’s good to see you. And, you know, I think, when we discussed talking about this episode, we kinda, like, immediately went back to the wonderfully formatted LaTeX paper where there was just stuff around, like, you know, should an AI say it’s human or not, and all the research we’ve done beforehand on it.
So maybe we can start with just kinda, like, the broad strokes of, like, that discussion and see where it takes us.
Oliver Shoulson
01:18 – 02:01
Yeah. Sure.
I mean, this is a topic that we did a fair amount of, like, in house research on in the sort of the pre LLM days, where this question of, you know, should a bot introduce itself as a bot? Should we give it a human-like name? Should we give it a name that’s sort of not human like, like your classic like Siri or, you know, something that sort of lets you know that it’s like a special kind of entity, not a human. And we did a fair amount of research on this to see how this impacted user behavior, call outcomes, things like that.
Because, you know, I think there’s a lot of fear when you’re building and designing virtual assistants that if you tell people it’s a robot, that’s just gonna immediately trigger all of the past terrible experiences they’ve had with, like, terrible IVRs,
Nikola Mrkšić
02:01 – 02:02
Yep. Yep.
Oliver Shoulson
02:02 – 02:06
and you’re gonna immediately say “representative, representative,” like, you know, hand me off.
Nikola Mrkšić
02:06 – 02:08
Yep.
Oliver Shoulson
02:08 – 03:29
And we do see that that has a significant effect in terms of first-turn handoff requests when you introduce the agent as a voice assistant, whether calling it a virtual agent, mentioning AI, or not mentioning AI. You know, when these studies were conducted, it was actually kind of right when ChatGPT was originally being rolled out.
So we were kind of interested to see, like, is this news making it to people that there are suddenly now these much more capable AI agents on the market? And what we find is that while there are effects in that first turn on the frequency that people ask to be transferred to a human representative, through clever design and by trying to win people’s engagement back, we actually see no statistically significant differences in call outcomes at all, in terms of, like, what proportion of people we’re able to contain within the system. Customer satisfaction is the same, if not a little bit better, when we are upfront about being a virtual assistant.
So this is kind of one of those interesting design challenges where it’s like, we know that we’re going to encounter some people who, for one reason or another, are reluctant to interact with an AI. And how do we sort of demonstrate our capabilities to them and win back their engagement? And in doing so, we find that we can mitigate a lot of those negative outcomes.