In 2024, financial services company Klarna announced that its new AI agent was handling two-thirds of customer service chats within its first month live. At a time when AI agents were making news for all the wrong reasons, Klarna seemed to have genuinely used AI to transform its customer service operations.
A year later, Klarna has just announced that it is reversing course and redirecting investment back into human agents.
Klarna CEO Sebastian Siemiatkowski offered a refreshingly honest explanation of what went wrong: “As cost unfortunately seems to have been a too predominant evaluation factor when organizing this, what you end up having is lower quality.”
Before we dig into what Klarna could have done better, let’s give credit where it’s due. The Klarna chatbot does a great job with the kind of personalization and contextualization that generative AI has made possible. As the early results show, it is one of the first well-built, high-volume chatbots of the genAI era. So what went wrong?
Based on my own experience with the Klarna chatbot, and on lessons I’ve learned building AI agents for enterprise brands at PolyAI, here’s my take on what happened and what other companies can learn from Klarna’s mistakes.
Think business goals, not just contact center goals
With an implementation fundamentally grounded in cutting costs, contact center success metrics (containment rates, automated minutes, even CSAT) can look great in the short term. But without intentional investment in improving CX through automation, you lose business.
Prospective customers researching your products/services are put off by friction and look elsewhere. Current customers become increasingly frustrated at having to go through the same steps many times, feel undervalued, and turn to competitors.
To truly understand the impact of an AI agent, you need to go beyond the organization’s current contact center KPIs and look to higher-level business metrics like conversion rates, retention rates and customer lifetime value.
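Retention in particular compounds into lifetime value. As a rough illustration (not from the Klarna case), here is the standard simplified CLV formula, where a small shift in retention rate swings lifetime value far more than a similar shift in per-contact cost:

```python
def customer_lifetime_value(avg_margin_per_period: float,
                            retention_rate: float,
                            discount_rate: float) -> float:
    """Textbook simplified CLV: m * r / (1 + d - r).

    avg_margin_per_period: average profit per customer per period
    retention_rate: fraction of customers retained each period (0-1)
    discount_rate: per-period discount rate for future cash flows
    """
    return avg_margin_per_period * retention_rate / (1 + discount_rate - retention_rate)

# Illustrative numbers only: dropping retention from 80% to 75%
# cuts lifetime value by far more than 5%.
base = customer_lifetime_value(100.0, 0.80, 0.10)
worse = customer_lifetime_value(100.0, 0.75, 0.10)
```

Metrics like this make the trade visible: a chatbot that “contains” a contact cheaply but frustrates the customer can still destroy value on the retention side.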
Latency is everything
While the Klarna chatbot gives great personalized and contextual responses, it’s slow, taking up to 20 seconds to answer a simple FAQ.
Latency is one of the central factors in users’ perception of agent quality: you can give the best answers in the world, but if you don’t give them fast enough, the interaction feels clunky at best and painful at worst. Beyond CX issues hurting the business in the long term, there’s a more immediate problem of abandonment: leave users hanging long enough and they’ll simply give up.
Continuously optimizing for latency is a central part of developing and maintaining an AI agent. This is even more important for a voice-first company like PolyAI. Not only are there simply more processes to run (audio needs to be converted to text, and text back to audio, on either side of text generation), but users need instant feedback over the phone in a way that isn’t necessary on a text-based channel, where we expect someone to type out a response and read it through before hitting send. We’re bringing our latency down in every module week by week – in ASR, LLM, and TTS.
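One way to keep per-module latency honest is an explicit budget for each stage of the voice pipeline, tracked release over release. This is a minimal sketch with made-up illustrative numbers, not PolyAI’s actual figures; the key idea is that with streaming, the caller’s perceived latency is ASR time plus time-to-first-LLM-token plus time-to-first-TTS-audio, not the full response time:

```python
# Hypothetical per-module latency budget (seconds) for a voice pipeline.
# The numbers are illustrative only.
BUDGET = {"asr": 0.30, "llm_first_token": 0.50, "tts_first_audio": 0.20}

def perceived_latency(measured: dict) -> float:
    """Time until the caller hears the first audio.

    With streaming, TTS starts on the first LLM tokens, so perceived
    latency is the sum of first-result times, not full completion times.
    """
    return measured["asr"] + measured["llm_first_token"] + measured["tts_first_audio"]

def over_budget(measured: dict) -> list:
    """Return the modules that exceeded their budget, for weekly tracking."""
    return [module for module, secs in measured.items() if secs > BUDGET[module]]
```

A regression report is then one call away: `over_budget({"asr": 0.25, "llm_first_token": 0.60, "tts_first_audio": 0.15})` flags only the LLM stage.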
Speak in your brand voice
Just like people, LLMs have default phrasings they like to fall back on – this is one of the things we’re hyper-aware of as a Dialogue Design team when building assistants for our clients. Some of the responses I got from the Klarna chatbot sounded like GPT rather than Klarna.
GPT has a few giveaways: “I apologize for the confusion/inconvenience” at the smallest conversational breakdown, “that’s correct” where a simple “yes” would do, “If you have any more questions or need further assistance, feel free to let me know” after answering an FAQ – not to mention the infamous em-dash. Making your AI agent consistently sound like your brand, and speak in your tone of voice, requires both robust few-shot prompting and hard overwrites for when those pesky fallbacks do sneak in.
LLMs are great at generating relevant responses, but it’s important to put in the work to make sure those responses really sound like your brand.
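A “hard overwrite” can be as simple as a post-processing pass over the generated text. This is an illustrative sketch; the patterns and replacements are examples I’ve invented for this post, not Klarna’s or PolyAI’s actual rules:

```python
import re

# Example filter for generic LLM phrasings that slip past few-shot
# prompting. Replacements would come from the brand's tone-of-voice guide.
BRAND_OVERWRITES = [
    (re.compile(r"I apologize for the (?:confusion|inconvenience)\.?", re.I),
     "Sorry about that."),
    (re.compile(r"That's correct\b", re.I), "Yes"),
    (re.compile(r"If you have any more questions or need further assistance, "
                r"feel free to let me know\.?", re.I), ""),
]

def apply_brand_voice(text: str) -> str:
    """Rewrite known fallback phrasings after generation, before sending."""
    for pattern, replacement in BRAND_OVERWRITES:
        text = pattern.sub(replacement, text)
    return " ".join(text.split())  # tidy whitespace left by deletions
```

For example, `apply_brand_voice("That's correct! I apologize for the confusion.")` comes back as `"Yes! Sorry about that."` In practice this sits alongside prompting, not instead of it: the prompt shapes most of the output, and the filter catches the stragglers.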
Allow users to take action, not just talk
Sometimes, it’s enough to respond to a customer with words and words alone. An FAQ, for example, often just requires an answer.
But customer service reps don’t just answer questions; they take action to solve problems. While on the line, they update customer information, issue refunds, offer upsells, and take payments – not just deflecting the user’s issue but resolving it live. Again, let’s be fair: Klarna’s bot does well to redirect users to the relevant parts of the app for self-serve processes, and to escalate to a human where necessary. But we drive real value through automation, with ROI and CX in harmony, when we build true end-to-end integrations that let customers get things done through conversation – not just get answers.
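The mechanics of this are tool (function) calling: the LLM emits a structured action instead of a sentence, and the agent platform executes it against backend systems. Here is a minimal dispatch sketch with hypothetical tool names; a production version would call authenticated APIs with confirmation steps and an audit trail:

```python
# Hypothetical action handlers; in production these would call real,
# authenticated backend APIs rather than returning strings.
def issue_refund(order_id: str, amount: float) -> str:
    return f"Refunded {amount:.2f} on order {order_id}"

def update_address(customer_id: str, address: str) -> str:
    return f"Address for {customer_id} updated to {address}"

TOOLS = {"issue_refund": issue_refund, "update_address": update_address}

def dispatch(tool_call: dict) -> str:
    """Execute a structured tool call emitted by the LLM.

    tool_call is assumed to look like:
    {"name": "issue_refund", "arguments": {"order_id": "...", "amount": 1.0}}
    """
    handler = TOOLS.get(tool_call["name"])
    if handler is None:
        raise ValueError(f"Unknown tool: {tool_call['name']}")
    return handler(**tool_call["arguments"])
```

The difference in outcome is the point: instead of “here’s how to request a refund,” the conversation ends with the refund actually issued.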
Conclusion
Klarna took a significant risk launching their generative AI chatbot when they did, and with a reported 70% containment rate, the effort seems to have paid off… to a point.
Some contact center and CX leaders are hesitant about implementing AI agents because of the risk to customer experience, and stories like Klarna’s will likely further cement those concerns.
But it doesn’t have to be this way.
At PolyAI, we’re committed to creating AI agents that customers actually want to talk to. AI agents need to earn the right to handle customers’ inquiries. Those deploying them will need to obsess over customer experience and prove to customers that they are not only capable, but worthy of helping them.
Want to know more about how PolyAI creates AI agents that improve CX? Request a demo today.