Gemini 3.1 Flash TTS is Google's text-to-speech model; it gives fine-grained control over generated speech through more than 200 audio tags.
- The model supports audio tags to control pacing, expression, and non-verbal vocalizations such as whispers and laughs
- Available on Google AI Studio and Vertex AI, supporting over 70 languages with 30 prebuilt voices
- Audio output is watermarked with SynthID to identify AI-generated content
- Use cases include accessibility for gaming, audiobooks with dramatic expression, and enterprise applications like banking fraud alerts
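
To make the audio-tag idea concrete, here is a minimal sketch of how a tagged script might be assembled before it is sent to the model. The square-bracket syntax, the tag names `whispers` and `laughs`, and the helpers `tag` and `build_script` are illustrative assumptions; the summary does not specify the tag format, and no API call is made here.

```python
from typing import Iterable, Optional, Tuple


def tag(name: str, text: str) -> str:
    """Prefix one line of dialogue with an inline audio tag.

    The "[name] text" format is an assumption for illustration only.
    """
    return f"[{name}] {text}"


def build_script(lines: Iterable[Tuple[Optional[str], str]]) -> str:
    """Join (tag_name, text) pairs into one prompt string.

    A tag_name of None leaves the line untagged.
    """
    return "\n".join(tag(name, text) if name else text for name, text in lines)


# Hypothetical audiobook passage mixing plain and tagged lines.
script = build_script([
    (None, "Welcome back to the story."),
    ("whispers", "Someone is at the door."),
    ("laughs", "It was only the cat."),
])
print(script)
```

Keeping tags as data rather than hand-typed markup makes it easy to swap in whatever tag syntax the model actually documents.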