Test-Time Deep Thinking to Explore Implicit Rules
A new framework called Test-Time Exploration (TTExplore) aims to improve how intelligent agents perform in environments governed by implicit rules, meaning hidden constraints that cannot be observed directly and must be inferred through interaction. A thinker component analyzes the interaction history and guides an actor, and the work also addresses challenges in evaluating deep reasoning. In experiments, the approach improves baseline agent performance by 14-19 points, which underscores the importance of reasoning about implicit rules.
- TTExplore is designed to help agents navigate environments with hidden constraints.
- The framework includes a thinker component that infers implicit rules from interaction history.
- Experiments indicate that TTExplore improves baseline agent performance by 14-19 points.
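To make the thinker/actor split concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the environment, the fixed list of candidate rules, and every name below (`HiddenRuleEnv`, `CANDIDATE_RULES`, `thinker`, `actor`, `run_episode`) are assumptions made for illustration, and the "thinker" is a simple hypothesis filter rather than an LLM. It only shows the general idea of inferring a hidden rule from interaction history and using that inference to choose the next action.

```python
from typing import Callable, Dict, List, Set, Tuple

History = List[Tuple[int, bool]]  # (action, whether the environment accepted it)

# Hypothetical candidate rules the toy thinker can choose between.
CANDIDATE_RULES: Dict[str, Callable[[int], bool]] = {
    "any": lambda a: True,
    "even_only": lambda a: a % 2 == 0,
    "odd_only": lambda a: a % 2 == 1,
    "less_than_5": lambda a: a < 5,
}


class HiddenRuleEnv:
    """Toy environment: accepts an action only if it satisfies a hidden rule."""

    def __init__(self, rule: Callable[[int], bool]) -> None:
        self._rule = rule  # never shown to the agent

    def step(self, action: int) -> bool:
        return self._rule(action)


def thinker(history: History) -> List[str]:
    """Return the candidate rules consistent with every observed outcome."""
    return [
        name
        for name, rule in CANDIDATE_RULES.items()
        if all(rule(action) == accepted for action, accepted in history)
    ]


def actor(consistent: List[str], tried: Set[int]) -> int:
    """Pick the first untried action allowed by the thinker's leading hypothesis."""
    rule = CANDIDATE_RULES[consistent[0]] if consistent else (lambda a: True)
    for action in range(10):
        if action not in tried and rule(action):
            return action
    raise RuntimeError("no untried action satisfies the current hypothesis")


def run_episode(env: HiddenRuleEnv, steps: int) -> Tuple[List[str], History]:
    """Alternate thinking and acting, then report the surviving hypotheses."""
    history: History = []
    for _ in range(steps):
        hypotheses = thinker(history)  # reason over the interaction history
        action = actor(hypotheses, {a for a, _ in history})
        history.append((action, env.step(action)))
    return thinker(history), history


if __name__ == "__main__":
    env = HiddenRuleEnv(lambda a: a % 2 == 0)  # hidden rule: even actions only
    print(run_episode(env, steps=4))
```

In this toy run, a rejected action is as informative as an accepted one: the rejection of action 1 is what eliminates the "any" and "less_than_5" hypotheses and leaves "even_only" as the only consistent rule.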
Opening excerpt (first ~120 words)
arXiv:2605.24828 (cs), Computer Science > Artificial Intelligence
Submitted on 24 May 2026
Title: Test-Time Deep Thinking to Explore Implicit Rules
Authors: Wentong Chen, Xin Cong, Zhong Zhang, Yaxi Lu, Siyuan Zhao, Yesai Wu, Qinyu Luo, Haotian Chen, Yankai Lin, Zhiyuan Liu, Maosong Sun
Abstract: With the continuous advancement of Large Language Models (LLMs), intelligent agents are becoming increasingly vital. However, these agents often fail in environments governed by implicit rules--hidden constraints that cannot be observed directly and must be inferred through interaction.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv cs.AI.