Natural Language Query to Configuration for Retrieval Agents
The paper presents BRANE, an approach that chooses a retrieval agent configuration for each natural language query. BRANE selects from a predefined catalog the configuration that best balances predicted accuracy against serving cost. The reported results show that it can cut costs substantially while matching the accuracy of the best fixed configurations.
- BRANE uses a large language model to convert natural language queries into workload-specific characteristics.
- It selects configurations that maximize predicted correctness while penalizing cost, allowing for a flexible cost-quality tradeoff.
- BRANE has been shown to achieve up to 89% lower costs while matching the accuracy of the best fixed configurations.
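The selection rule described in the list above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes a linear cost penalty (`cost_weight`), and the names `Config`, `select_config`, `toy_predictor`, the `multi_hop` feature, and all catalog values are hypothetical. In BRANE the query characteristics come from a large language model; here they are a hand-written dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence


@dataclass(frozen=True)
class Config:
    """One catalog entry; fields mirror the knobs named in the abstract."""
    llm: str
    retriever: str
    num_docs: int
    num_hops: int
    synthesis: str
    cost: float  # hypothetical per-query serving cost, arbitrary units


def select_config(
    catalog: Sequence[Config],
    features: Mapping[str, bool],
    predict_correctness: Callable[[Mapping[str, bool], Config], float],
    cost_weight: float,
) -> Config:
    """Pick the entry maximizing predicted correctness minus weighted cost.

    The linear penalty is an assumption made for this sketch.
    """
    if not catalog:
        raise ValueError("catalog must not be empty")
    return max(
        catalog,
        key=lambda c: predict_correctness(features, c) - cost_weight * c.cost,
    )


def toy_predictor(features: Mapping[str, bool], config: Config) -> float:
    """Stand-in for a learned correctness predictor (values are made up)."""
    if features["multi_hop"]:
        return 0.9 if config.num_hops >= 2 else 0.4
    return 0.85 if config.num_hops >= 2 else 0.8


# Hypothetical two-entry catalog: a cheap and an expensive configuration.
catalog = [
    Config("small-llm", "bm25", 5, 1, "single-pass", cost=1.0),
    Config("large-llm", "dense", 20, 3, "iterative", cost=10.0),
]

simple_choice = select_config(catalog, {"multi_hop": False}, toy_predictor, 0.01)
hard_choice = select_config(catalog, {"multi_hop": True}, toy_predictor, 0.01)
```

Under these made-up numbers, the simple query is routed to the cheap configuration and the multi-hop query to the expensive one. Raising `cost_weight` shifts more queries toward cheap configurations, which is one way to read the cost-quality tradeoff mentioned above.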
Opening excerpt (first ~120 words)
arXiv:2605.27361 (cs), Computer Science > Artificial Intelligence
Submitted on 26 May 2026
Title: Natural Language Query to Configuration for Retrieval Agents
Authors: Melissa Z. Pan, Negar Arabzadeh, Mathew Jacob, Fiodar Kazhamiaka, Esha Choukse, Matei Zaharia
Abstract: Modern retrieval agents expose many configuration choices -- LLM, retriever, number of documents, number of hops, and synthesis strategy -- each shaping both answer quality and serving cost. Today, these pipelines are typically hand-tuned once per workload, leaving substantial per-query optimization untapped.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.