PAVO-Bench – 50K voice turns and an 85K-param router for ASR→LLM→TTS

Apr 28, 2026 · 2:58 PM UTC

A 50K-turn voice pipeline benchmark and an 85K-param meta-controller that cuts P95 latency by 10.3% and energy by 71% versus a fixed-cloud baseline. TMLR 2026. Repository: vnmoorthy/pavo-bench

Original article

GitHub

Opening excerpt (first ~120 words)

PAVO: Pipeline-Aware Voice Orchestration

Demand-conditioned inference routing for real-time ASR → LLM → TTS voice pipelines.

PAVO treats the voice-assistant pipeline as a jointly optimizable inference graph. An 85,041-parameter meta-controller, trained with multi-objective PPO in 106 seconds, decides per turn whether to route each ASR → LLM → TTS call to a cloud or edge configuration. The empirical contribution is a characterization of inter-stage coupling constraints: quality dependencies where upstream ASR choices bound what downstream LLMs can recover from.

Authors: NarasingaMoorthy VeiluKanthaPerumal (University of Pennsylvania) and Mohammed Imthathullah (Google).
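
To make the per-turn routing idea concrete, here is a minimal sketch of choosing a cloud or edge configuration for each stage under a quality constraint. It is not PAVO's method: the excerpt describes a learned PPO policy, whereas this sketch swaps in an exhaustive search over the eight possible assignments. The profile numbers, the "weakest stage caps quality" coupling rule, the cost weights, and the names PROFILES, evaluate, and route_turn are all assumptions made up for illustration.

```python
from itertools import product

STAGES = ("asr", "llm", "tts")

# Hypothetical per-stage profiles: (latency_ms, energy_j, quality in [0, 1]).
# These numbers are invented for illustration; they are not PAVO-Bench data.
PROFILES = {
    "asr": {"cloud": (120.0, 4.0, 0.95), "edge": (60.0, 1.0, 0.80)},
    "llm": {"cloud": (400.0, 20.0, 0.95), "edge": (250.0, 5.0, 0.85)},
    "tts": {"cloud": (150.0, 5.0, 0.95), "edge": (80.0, 1.5, 0.90)},
}


def evaluate(config):
    """Return (latency_ms, energy_j, quality) for one cloud/edge assignment."""
    latency = sum(PROFILES[s][config[s]][0] for s in STAGES)
    energy = sum(PROFILES[s][config[s]][1] for s in STAGES)
    # Assumed coupling rule: the pipeline is only as good as its weakest
    # stage, so a weak ASR choice caps what later stages can deliver.
    quality = min(PROFILES[s][config[s]][2] for s in STAGES)
    return latency, energy, quality


def route_turn(min_quality, latency_weight=1.0, energy_weight=10.0):
    """Pick the cheapest assignment that meets this turn's quality demand.

    Returns a dict mapping stage name to "cloud" or "edge", or None when
    no assignment reaches min_quality.
    """
    best_config, best_cost = None, float("inf")
    for choice in product(("cloud", "edge"), repeat=len(STAGES)):
        config = dict(zip(STAGES, choice))
        latency, energy, quality = evaluate(config)
        if quality < min_quality:
            continue  # this assignment cannot satisfy the turn's demand
        cost = latency_weight * latency + energy_weight * energy
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config
```

With these made-up profiles, a low-demand turn such as route_turn(0.75) can run entirely on edge, while route_turn(0.85) is forced to move ASR to cloud because the edge ASR quality caps the whole pipeline. A learned controller would replace the exhaustive loop with a policy that maps turn features to an assignment directly.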

…

Excerpt limited to ~120 words for fair-use compliance. The full article is at GitHub.
