Changelog
Follow up on the latest improvements and updates.
🎤 Introducing Aura-2 – Enterprise-Grade Text-to-Speech
Millis now supports Deepgram’s Aura-2 voices, delivering a professional, cost-effective TTS experience for real-time interactions.
- Natural, accurate, and fast – Handles domain-specific pronunciations like drug names, legal terms, alphanumerics, dates, times, and currency with human-like clarity.
- Ultra-low latency – Achieves sub-200ms TTFB, perfect for real-time applications.
- Enterprise-grade reliability – Built for high-volume, mission-critical voice AI deployments, not entertainment scenarios.
Aura-2 voices are available on Millis today! Simply select Deepgram as your agent's voice provider. 🚀
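If you configure agents through the API rather than the dashboard, switching to Aura-2 is just a change to the agent's voice settings. The snippet below is only a minimal sketch: the field names (`provider`, `voice_id`) and the voice id `aura-2-thalia-en` are illustrative assumptions, so check the agent config reference for the exact schema.

```typescript
// Hypothetical agent config sketch: field names and the voice id are
// assumptions for illustration, not the exact Millis schema.
const agentConfig = {
  prompt: "You are a friendly receptionist for a dental clinic.",
  voice: {
    provider: "deepgram",         // assumed key for Deepgram TTS
    voice_id: "aura-2-thalia-en", // an example Aura-2 voice
  },
};
```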
🧠 GPT-4.1, GPT-4.1 Mini, and GPT-4.1 Nano Support
Great news! Millis now supports OpenAI’s latest GPT-4.1 family models in the API. Here’s what’s new with GPT-4.1 models:
- Outperforms GPT-4o and GPT-4o Mini across all benchmarks, with major improvements in instruction following.
- Massive context window—up to 1 million tokens, with better long-context comprehension.
- Model options for different needs:
  - GPT-4.1 Nano – Fastest and most cost-effective GPT-4.1 model.
  - GPT-4.1 Mini – Great balance of intelligence, speed, and cost for versatile use cases.
Try the new models for even smarter and faster AI Voice Agents! 🚀

Great news! Millis now supports Solaria, Gladia’s most advanced real-time speech-to-text (STT) model—built for high-performance voice platforms. What makes Solaria stand out:
- Major quality upgrade for real-time transcription in English and many other major languages
- Support for 100 languages, including 42 exclusive to Solaria
- Ultra-low latency (as low as 270ms) for natural, uninterrupted conversations
- Enterprise-grade accuracy – 94% across complex use cases
Try it out by selecting Gladia as your agent's Speech-To-Text provider in the Voice config dialog! 🚀

Great news—we’ve added three top open-source voice models to the platform: Orpheus, Sesame, and Kokoro. They’re all available now, and we encourage you to try them out! 🎙
Orpheus stands out with the most human-like voice and emotional expressiveness. You can even control the emotion with prompts. Try it with the Tara voice for the best results. Here’s a simple prompt to get you started:
You are a companion. Let's have a fun chat.
Add these tags to make conversation more natural: <laugh>, <chuckle>, <sigh>, <cough>, <sniffle>, <groan>, <yawn>, <gasp>
⚠️ Sesame, despite its impressive demo, has a very basic open-source release for now. It struggles with hallucinations and instability—fun to experiment with, but not ready for production. We’re keeping an eye out for future releases that improve on this.
Happy testing, and let us know your feedback!
🎧 Stream Audio via Custom LLM
You can now stream audio responses via custom LLMs—perfect for tasks where you want to deliver specific audio clips to the user.
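If you run your own custom LLM endpoint, the idea is that a streamed chunk can now carry audio instead of (or alongside) text. The sketch below is only a rough illustration under assumed behavior: it assumes Millis POSTs the conversation to your endpoint and reads a server-sent event stream back, and the payload fields (`type: "audio"`, base64 `data`, `type: "end"`) are placeholders, not the documented custom LLM protocol.

```typescript
// Sketch of a custom LLM endpoint that streams an audio clip back to Millis.
// All payload field names here are hypothetical placeholders; refer to the
// custom LLM integration docs for the real streaming format.
import { createServer } from "node:http";
import { readFileSync } from "node:fs";

createServer((req, res) => {
  if (req.method === "POST" && req.url === "/chat") {
    res.writeHead(200, { "Content-Type": "text/event-stream" });

    // Stream a pre-recorded clip (e.g. a legal disclaimer) as base64 audio.
    const clip = readFileSync("disclaimer.wav").toString("base64");
    res.write(`data: ${JSON.stringify({ type: "audio", data: clip })}\n\n`);
    res.end(`data: ${JSON.stringify({ type: "end" })}\n\n`);
  } else {
    res.writeHead(404).end();
  }
}).listen(3000);
```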
⏱️ Webhook Timeout Configuration
Added support to configure webhook timeout, letting you control how long the agent should wait for a webhook response before continuing.
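As a rough sketch, the timeout would sit alongside the webhook definition in your agent or tool config. The field name `timeout` and its unit below are assumptions for illustration; check the webhook settings in the dashboard or config reference for the actual name.

```typescript
// Hypothetical webhook config: the "timeout" field name and its unit are
// assumptions, not necessarily the exact Millis schema.
const webhook = {
  url: "https://example.com/hooks/booking",
  method: "POST",
  timeout: 5, // give up after ~5 seconds and let the agent continue
};
```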
🔢 DTMF Now Supported for Vonage
Agents can now send DTMF tones on calls made through Vonage, enabling smooth IVR navigation across more providers.
📞 Termination Message in API
When using the call termination API, you can now specify a message the agent will say before ending the call—making terminations feel more natural.
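Calling the termination API with a farewell message could look roughly like the sketch below. The endpoint path, auth header, and `message` field are assumptions used for illustration only; see the API reference for the exact request shape.

```typescript
// Hypothetical request: endpoint path, auth header, and body fields are
// placeholders for illustration, not the documented Millis API.
const resp = await fetch("https://api.millis.ai/calls/CALL_ID/terminate", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: "Bearer YOUR_API_KEY",
  },
  body: JSON.stringify({
    // Spoken by the agent right before the call ends.
    message: "Thanks for calling, have a great day. Goodbye!",
  }),
});
console.log("Termination accepted:", resp.ok);
```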
📄 Call Status Documentation
We've published a list of call statuses: https://docs.millis.ai/core-concepts/call-status
In Millis, users can define phrases for tools—these are what the agent says when calling a tool. Previously, the agent always used the same phrase, regardless of the language it was speaking, which could cause mismatched responses. Now, agents can intelligently adjust their responses to match the conversation's language.
- You can now set a Response Mode when defining a tool.
- With Response Mode set to "flexible", the agent will adapt the phrase to match its current speaking language.
- These phrases serve as examples, so make sure they match the language of your prompt for the best accuracy.
This makes multilingual interactions more natural and seamless.
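In config terms, the new option sits alongside the phrase you already define for the tool. The sketch below uses assumed field names (`phrase`, `response_mode`) purely to illustrate the idea; the exact keys may differ in the tool config schema.

```typescript
// Hypothetical tool definition: "phrase" and "response_mode" are illustrative
// names, not necessarily the exact Millis config keys.
const bookingTool = {
  name: "check_availability",
  description: "Look up open appointment slots for a given date.",
  phrase: "Let me check the calendar for you.", // written in your prompt's language
  response_mode: "flexible", // agent rephrases this to match the spoken language
  params: {
    type: "object",
    properties: { date: { type: "string", description: "ISO date to check" } },
    required: ["date"],
  },
};
```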
🌍 Enhanced Multilingual Support
You can now define a list of languages for your agent to focus on, meaning better performance and accuracy in multilingual conversations.
⚡ Faster Call Start Time
We've optimized the system so your agent can deliver the first response even faster, reducing wait time at the beginning of calls.
📞 Improved Call Stability
Improved stability of call transferring, voicemail detection, and call termination triggers, ensuring a smoother and more reliable experience.
These updates make your agents even more responsive and reliable—try them out! 🚀
Millis now supports call transfer on both Plivo and Twilio, making it easier to seamlessly hand off calls to another number. More provider support is coming soon—stay tuned!
Our Web SDK now receives tool call triggers and results, allowing the client to handle them directly. Use case example:
- A user asks for information.
- The agent queries via tool call.
- The agent receives a list of options as the tool call result.
- The client can now display these options in the web app for a more intuitive user experience.
This makes interactions smoother and more interactive—try it out! 🚀
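On the client, this means you can subscribe to tool activity and render the result yourself. The sketch below is a rough illustration: the event names (`ontoolcall`, `ontoolresult`) and payload fields are assumptions, so check the Web SDK reference for the actual callback names and shapes.

```typescript
// Hypothetical Web SDK usage: event names and payload fields below are
// assumptions for illustration; see the Millis Web SDK docs for the real API.
import Millis from "@millisai/web-sdk";

const client = Millis.createClient({ publicKey: "YOUR_PUBLIC_KEY" });

// Assumed event: fired when the agent starts a tool call.
client.on("ontoolcall", (call: { name: string; params: unknown }) => {
  console.log(`Agent is calling ${call.name}`); // e.g. show a loading state
});

// Assumed event: fired when the tool call result comes back.
client.on("ontoolresult", (result: { name: string; data: unknown }) => {
  renderOptions(result.data); // e.g. display the returned options in the UI
});

function renderOptions(data: unknown) {
  // Replace with real UI rendering (list, cards, buttons, ...)
  console.log("Options from tool call:", data);
}
```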