2 results for "search optimization"
ARXIV CS.AI
Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing
Serving transformer language models with high throughput requires caching Key-Values (KVs) to avoid redundant computation during autoregressive generation. The memory footprint of KV caching is signif…
ARXIV.ORG
A2DEPT: Large Language Model-Driven Automated Algorithm Design via Evolutionary Program Trees
Designing heuristics for combinatorial optimization problems (COPs) is a fundamental yet challenging task that traditionally requires extensive domain expertise. Recently, Large Language Model (LLM)-b…