AI Voice and AI Avatars: From Scripts to Real-Time Conversation

Dan Reeves, Technical Director

March 2026

In the first of our three-part AI voice and AI avatars mini-series, Dan Reeves, Technical Director at Robiquity, discusses the foundational shift from scripts to real-time conversation.

AI voice technology has existed for years, but only recently has it begun to deliver material business value.

As models mature and user expectations evolve, organisations are moving away from scripted automation and toward conversational systems that can operate in real time, understand context and take action.


Finding Value Beyond the Hype

Over the last few years, we’ve all seen voice bots come and go in waves of excitement. The concept of voice agents isn’t new; we’ve all had a slightly frustrating interaction with an automated virtual assistant where we end up repeatedly stating ‘agent!’ or ‘representative!’ just to get to a human.

For a long time, the concept was impressive and the use case was valid, but ultimately the technology and the feel of the interaction prevented widespread adoption and true value realisation.

That dynamic has now shifted.

Recent advances in multimodal and real-time language models have fundamentally changed what AI voice can do. What we’re seeing now isn’t just ‘better chatbots with a microphone’ - it’s the emergence of a new interface layer for business. One that is conversational, contextual, capable of taking action and feels more human.

The biggest shift, though, isn’t just technological. While model capability is the obvious driver, the more subtle shift is user expectation. Increasingly, more of us use ChatGPT, Claude, Gemini or Alexa regularly. We’re less hesitant to proceed with an AI agent if we get what we need and do not feel like we’re being stonewalled by something capable of five rigid scripted responses. At Robiquity, we believe this shift is bigger than most realise.

From Scripts to Real-Time Conversation

Traditional voice agents relied on intent classification and rigid dialogue trees, with speech recognition and text-to-speech layered around them. They were deterministic, not generative.
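To make that concrete, here is a minimal sketch of how a scripted agent of this kind might be wired together. It is illustrative only: the intents, keywords and prompts are hypothetical, and real platforms used trained classifiers rather than keyword matching. The point it shows is the deterministic part: anything outside the script falls through to the same fallback.

```python
from typing import Optional

# Hypothetical intents and trigger keywords (assumptions for illustration).
INTENT_KEYWORDS = {
    "check_balance": ["balance", "how much"],
    "book_appointment": ["book", "appointment"],
}

# A one-level "dialogue tree": each intent maps to exactly one scripted prompt.
DIALOGUE_TREE = {
    "check_balance": "Please say your account number.",
    "book_appointment": "Which day would you like to book?",
}

FALLBACK = "Sorry, I didn't understand. Please say 'balance' or 'appointment'."


def classify_intent(utterance: str) -> Optional[str]:
    """Return the first intent whose keyword appears in the utterance, else None."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None


def scripted_reply(utterance: str) -> str:
    """Same input always yields the same scripted output; unknown input hits the fallback."""
    intent = classify_intent(utterance)
    return DIALOGUE_TREE.get(intent, FALLBACK)
```

A caller who asks "What's my balance?" gets the scripted prompt, while one who says "I need to move my visit to Friday" gets the fallback, even though the request is perfectly clear to a human. Covering that phrasing means adding another keyword or another branch by hand.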

The process was overshadowed by conversational delay, misinterpretation and friction. Turn-taking was unnatural and the experience felt mechanical.

We’re now seeing the emergence of native audio models and ever more seamless speech-to-speech capabilities. More importantly, these models are increasingly designed to integrate with enterprise systems such as CRMs, booking engines, policy databases and workflow engines.

Model maturity has driven more stable, context-aware interactions, reducing the need for constant intent reconfiguration or scripted dialogue expansion. Instead of building new decision trees for every edge case, organisations can support broader query types through governed model behaviour.

As integration capabilities expand across enterprise systems, the value shifts from conversation to execution. Agents can retrieve context from multiple systems, surface consolidated responses and trigger downstream actions without manual handoffs or system switching.
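As a rough sketch of what that execution layer could look like, the example below consolidates context from two systems and only triggers actions that appear on an allow-list. Everything in it is an assumption for illustration: the in-memory `CRM` and `BOOKINGS` stores, the action names and the `GovernedAgent` class are hypothetical, not a description of any specific product.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# Hypothetical in-memory stand-ins for a CRM and a booking engine.
CRM: Dict[str, dict] = {"cust-1": {"name": "Sam", "tier": "gold"}}
BOOKINGS: Dict[str, dict] = {"cust-1": {"booking_id": "b-42", "date": "2026-04-02"}}


def gather_context(customer_id: str) -> dict:
    """Consolidate what each system knows about the customer into one view."""
    return {
        "customer": CRM.get(customer_id, {}),
        "booking": BOOKINGS.get(customer_id, {}),
    }


def reschedule_booking(customer_id: str, new_date: str) -> str:
    """Downstream action: update the booking date in the booking store."""
    booking = BOOKINGS[customer_id]
    booking["date"] = new_date
    return f"rescheduled {booking['booking_id']} to {new_date}"


def issue_refund(customer_id: str, amount: float) -> str:
    """Downstream action that exists but may not be permitted for the agent."""
    return f"refunded {amount:.2f} to {customer_id}"


# Registry of actions the platform knows how to perform.
ACTIONS: Dict[str, Callable[..., str]] = {
    "reschedule_booking": reschedule_booking,
    "issue_refund": issue_refund,
}


@dataclass
class GovernedAgent:
    allowed_actions: Set[str]
    audit_log: List[dict] = field(default_factory=list)

    def act(self, action: str, customer_id: str, **params) -> str:
        """Run an action only if it is allow-listed; log every attempt either way."""
        allowed = action in self.allowed_actions and action in ACTIONS
        self.audit_log.append({"action": action, "customer": customer_id, "allowed": allowed})
        if not allowed:
            return "escalate_to_human"
        return ACTIONS[action](customer_id, **params)
```

In this sketch, an agent created with `GovernedAgent(allowed_actions={"reschedule_booking"})` can move a booking without a handoff, while a refund request is logged and escalated to a person. In a real deployment the model would choose the action, and the allow-list and audit log would sit between that choice and the systems it touches.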

The impact is measurable: reduced handling time, fewer transfers, lower re-contact rates and improved cost-per-resolution. In operational terms, that translates to fewer escalations, lower support costs per interaction and improved SLA performance.

These advances have changed the conversation from ‘Can it talk?’ to ‘Can it safely act?’.

Safe and Responsible Deployment

The shift from scripted voice automation to real-time conversational systems is not cosmetic. It represents a fundamental change in how enterprises interact with customers and internal teams. But as voice begins to act, not just respond, a new question emerges: how do we deploy it safely, responsibly and at scale? That’s where governance becomes critical.

If you’re still relying on scripted automation, it may be time to rethink what conversational AI could really be doing for your organisation. We’re experts in this space - get in touch with us today and we can help get you started.

Alternatively, if you’ve already got an AI voice strategy in place but can’t yet act safely, measurably and with accountability, governance is where to focus next. Stay tuned for the second instalment in my AI voice and AI avatars mini-series, which will discuss exactly that: “Governance is Catching Up”, coming next week!
