Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech

2026-06-09

Endigest AI Core Summary

This benchmark evaluates how frontier automatic speech recognition (ASR) models handle bilingual, code-switched speech across four language pairs, in the context of enterprise voice agents.

• The study uses three metrics: Word Error Rate (WER) for transcription accuracy, Semantic WER (SWER) for meaning preservation, and Answer Error Rate (AER) for downstream task performance.
• ElevenLabs Scribe V2 and Google Gemini 3 Flash emerge as the top performers, with Scribe V2 achieving the lowest overall error rates.
• OpenAI Whisper Large V3 Turbo significantly underperforms because it translates code-switched audio into English instead of preserving the mixed-language transcription.
• The semantic metrics show that language understanding matters: Gemini outperforms AssemblyAI on AER despite comparable WER, which suggests an advantage for large audio language models (LALMs).
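
To make the first metric concrete, here is a minimal sketch of the standard WER definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate`, the lowercase-and-split normalization, and the sample sentences are assumptions for illustration only; the benchmark's own text normalization is not described in this summary. SWER and AER are not sketched because their exact definitions are not given here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance / number of reference words.

    Normalization here (lowercasing, whitespace split) is an assumption,
    not the benchmark's documented procedure.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical code-switched utterance (invented for this example).
reference = "quiero pagar my bill por favor"
preserved = "quiero pagar my bill por favor"   # mixed-language transcript kept
translated = "i want to pay my bill please"    # audio rendered as English

print(word_error_rate(reference, preserved))
print(word_error_rate(reference, translated))
```

The second call shows why translating instead of transcribing is penalized so heavily: the meaning survives, but almost every reference word counts as an error. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.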
