AI Research news.
Page 2 of AI Research headlines on WeSearch, deduped and updated continuously from 10+ editorial sources.
ARXIV CS.AI
SAGE: A Quantitative Evaluation of Socialized Evolution in Agent Ecosystems
ARXIV CS.AI
From Prompt to Service: An SLM-Based Agent Orchestration Gateway for AI-Driven Virtual Worlds
ARXIV CS.AI
Cross-Lingual Token Arbitrage: Optimizing Code Agent Context Windows via Local LLM Preprocessing
ARXIV CS.AI
Bridging Auxiliary Constraints to Resolve Instruction Following in Large Reasoning Models
ARXIV CS.AI
TSQAgent: Rating Time Series Data Quality via Dedicated Agentic Reasoning
ARXIV CS.AI
Gender-Dependent Diagnostic Substitution in LLM Medical Triage: Same Symptoms, Unequal Urgency
ARXIV CS.AI
Towards Non-Monotonic Entailment in Propositional Defeasible Standpoint Logic
ARXIV CS.AI
Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition
ARXIV CS.AI
From Answers to States: Verifiable Process-Level Evaluation of Chemical Reasoning in Large Language Models
ARXIV CS.AI
EvoDrive: Pareto Evolution for Safety-Critical Autonomous Driving via Self-Improving LLM Agents
ARXIV CS.AI
The DeepSpeak-Agentic Dataset
ARXIV CS.AI
SkillPyramid: A Hierarchical Skill Consolidation Framework for Self-Evolving Agents
ARXIV CS.AI
Dynamic Objective Selection with Safeguards and LLM Oversight for Financial Decision-Making
ARXIV CS.AI
Code-on-Graph: Iterative Programmatic Reasoning via Large Language Models on Knowledge Graphs
ARXIV CS.AI
Unveiling the Structure of Do-Calculus Reasoning via Derivation Graphs
ARXIV CS.AI
When to Re-Plan: Subgoal Persistence in Hierarchical Latent Reasoning
ARXIV CS.AI
Proof-Refactor: Refactoring Generated Formal Proofs into Modular Artifacts
ARXIV CS.AI
LAP: An Agent-to-Instrument Protocol for Autonomous Science
GOOGLE NEWS
How AI is Transforming Scientific Discovery While Keeping Humans at the Center - Stanford HAI
ARXIV CS.AI
BrickAnything: Geometry-Conditioned Buildable Brick Generation with Structure-Aware Tokenization
ARXIV CS.AI
Can LLMs Introspect? A Reality Check
ARXIV CS.AI
Is Agent Memory a Database? Rethinking Data Foundations for Long-Term AI Agent Memory
ARXIV CS.AI
Personalizing Embodied Multimodal Large Language Model Agents over Long-term User Interactions
ARXIV CS.AI
Constraint acquisition needs better benchmarks
ARXIV CS.AI
Your Agents Are Aging Too: Agent Lifespan Engineering for Deployed Systems
ARXIV CS.AI
Experiments in Agentic AI for Science
ARXIV CS.AI
Anchor: Mitigating Artifact Drift in Agent Benchmark Generation
ARXIV CS.AI
OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling
ARXIV CS.AI
JobBench: Aligning Agent Work With Human Will
ARXIV CS.AI
Managing Uncertainty in LLM-Generated Procedural Knowledge for Virtual Laboratory Planning
ARXIV CS.AI
ScientistOne: Towards Human-Level Autonomous Research via Chain-of-Evidence
ARXIV CS.AI
Automatic Layer Selection for Hallucination Detection
ARXIV CS.AI
Exploiting Local Dynamics Regularity for Reusable Skills in Offline Hierarchical RL
ARXIV CS.AI
Advancing Creative Physical Intelligence in Large Multimodal Models
ARXIV CS.AI
From Static Context to Calibrated Interactive RL: Mitigating Distribution Shift in Multi-turn Dialogue with Aligned Simulator
ARXIV CS.AI
Reasoning, Code, or Both? How Large Language Models Handle Variations in Math Questions
ARXIV CS.AI
The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence
ARXIV CS.AI
Which Changes Matter? Towards Trustworthy Legal AI via Relevance-Sensitive Evaluation and Solver-Grounded Reasoning
ARXIV CS.AI
PolyFusionAgent: A Multimodal Foundation Model and Autonomous AI Assistant for Polymer Property Prediction and Inverse Design
ARXIV CS.AI
MobileExplorer: Accelerating On-Device Inference for Mobile GUI Agents via Online Exploration
ARXIV CS.AI
MedGuideX: Internalizing Decision Logic from Executable Guidelines into Large Language Models for Clinical Reasoning
ARXIV CS.AI
AGORA: Adapter-Grounded Observation-Action Retention for Inference-Free Prompt Compression in LLM Agents
ARXIV CS.AI
FAST-GOAL: Fast and Efficient Global-local Object Alignment Learning
ARXIV CS.AI
Tail-Aware HiFloat4: W4A4 Post-Training Quantization for Wan2.2
ARXIV CS.AI
UnityMAS-O: A General RL Optimization Framework for LLM-Based Multi-Agent Systems
ARXIV CS.AI
Completion vs Optimality: Policy Gradient in Long-Horizon Cumulative-Damage Problems
ARXIV CS.AI
MemFail: Stress-Testing Failure Modes of LLM Memory Systems
ARXIV CS.AI
Mind the Tool Failures: Achieving Synergistic Tool Gains for Medical Agents
ARXIV CS.AI
Towards Feedback-to-Plan Decisions for Self-Evolving LLM Agents in CUDA Kernel Generation
ARXIV CS.AI