Gemini 3.1 Flash Live: Making audio AI more natural and reliable

Our latest voice model improves precision and lowers latency, making voice interactions more fluid and natural.

Today, we’re advancing Gemini’s real-time dialogue capabilities with Gemini 3.1 Flash Live, our highest-quality audio and voice model yet. It delivers the speed and natural rhythm needed for the next generation of voice-first AI, offering a more intuitive experience for developers, enterprises and everyday users.

3.1 Flash Live is available across Google products:

• For developers in preview via the Gemini Live API in Google AI Studio
• For enterprises in Gemini Enterprise for Customer Experience
• For everyone via Search Live and Gemini Live

For developers: Robust reasoning and task execution

We’ve improved 3.1 Flash Live’s overall quality, so developers and enterprises can more reliably build voice-first agents that complete complex tasks at scale. On ComplexFuncBench Audio, a benchmark that captures multi-step function calling with various constraints, it leads with a score of 90.8%, an improvement over our previous model.

ComplexFuncBench audio bar graph
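
To make "multi-step function calling" concrete, here is a minimal sketch of the pattern in plain Python. It does not use the Gemini Live API: the tools (`find_store`, `check_stock`), their data and the `scripted_model` stand-in are all hypothetical, chosen only to show how the arguments of a later call can depend on the result of an earlier one.

```python
from typing import Callable

# Hypothetical tools; names and data are illustrative assumptions.
def find_store(zip_code: str) -> dict:
    return {"store_id": "S-17", "zip_code": zip_code}

def check_stock(store_id: str, item: str) -> dict:
    inventory = {("S-17", "cordless drill"): 4}
    count = inventory.get((store_id, item), 0)
    return {"store_id": store_id, "item": item, "in_stock": count}

TOOLS: dict[str, Callable[..., dict]] = {
    "find_store": find_store,
    "check_stock": check_stock,
}

def scripted_model(history: list[dict]) -> dict:
    """Stand-in for a model: picks the next step from the results so far."""
    if not history:
        return {"call": "find_store", "args": {"zip_code": "30301"}}
    last = history[-1]
    if last["call"] == "find_store":
        # The second call reuses the store_id returned by the first call.
        return {
            "call": "check_stock",
            "args": {"store_id": last["result"]["store_id"],
                     "item": "cordless drill"},
        }
    result = last["result"]
    return {"final": f"{result['in_stock']} in stock at {result['store_id']}"}

def run_agent(model, max_steps: int = 5) -> tuple[str, list[dict]]:
    """Run tool calls until the model returns a final answer."""
    history: list[dict] = []
    for _ in range(max_steps):
        step = model(history)
        if "final" in step:
            return step["final"], history
        result = TOOLS[step["call"]](**step["args"])
        history.append({"call": step["call"], "args": step["args"],
                        "result": result})
    raise RuntimeError("step budget exhausted")

answer, trace = run_agent(scripted_model)
```

In a real voice agent the scripted stand-in would be replaced by the model's own tool-call decisions; the step budget is one simple way to bound a chain that never reaches a final answer.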

BigBenchAudio bar graph

On Scale AI’s Audio MultiChallenge, Gemini 3.1 Flash Live leads with a score of 36.1% with “thinking” on. The benchmark specifically tests complex instruction following and long-horizon reasoning amidst the interruptions and hesitations typical of real-world audio.

AudioMultiChallenge bar graph

3.1 Flash Live also has improved tonal understanding to deliver more natural dialogue. In Gemini Enterprise for Customer Experience, it’s even more effective at recognizing acoustic nuances like pitch and pace than 2.5 Flash Native Audio. It’s also better at dynamically adjusting its response to users' expressions of frustration or confusion.

3.1 Flash Live lets you build voice-ready agents that handle complex tasks in noisy environments.

Illustrative demonstration built with Gemini 3.1 Pro, powered by Gemini 3.1 Flash Live.

3.1 Flash Live lets you use your voice to vibe code and quickly iterate.

Illustrative demonstration built with Gemini 3.1 Pro, powered by Gemini 3.1 Flash Live.

Companies like Verizon, LiveKit and The Home Depot have given positive feedback on 3.1 Flash Live in their workflows, highlighting its improved, natural conversation.

Quotes from The Home Depot, Verizon, LiveKit, Wavera, Stream and YouTube

For everyone: More natural and intuitive interactions

In Gemini Live and Search Live, the 3.1 Flash Live model delivers more helpful and natural responses, whether you’re asking quick daily questions or engaging in more complex conversations.

With the 3.1 Flash Live model under the hood, Gemini Live responds faster than the previous model and can follow the thread of your conversation for twice as long, keeping your train of thought intact during longer brainstorms.

3.1 Flash Live is also inherently multilingual, which enables this week’s global expansion of Search Live. With this launch, people in more than 200 countries and territories can now have real-time, multimodal conversations with Search in their preferred language.

Try Gemini 3.1 Flash Live

All audio generated by 3.1 Flash Live is watermarked with SynthID. This imperceptible watermark is interwoven directly into the audio output, allowing reliable detection of AI-generated content to help prevent misinformation. For more information on our approach to safety and responsibility, see the model card.

Experience the naturalness and reliability of 3.1 Flash Live, starting today. We look forward to seeing how you interact and build with it.
