Deep Research in Physical Sciences: A Multi-Agent Framework and Comprehensive Benchmark • Paper • 2606.18648 • Published 7 days ago • 12
PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems • Paper • 2606.22388 • Published 3 days ago • 77
PerceptionDLM: Parallel Region Perception with Multimodal Diffusion Language Models • Paper • 2606.19534 • Published 7 days ago • 55
FAPO: Fully Autonomous Prompt Optimization of Multi-Step LLM Pipelines • Paper • 2606.19605 • Published 7 days ago • 10
Beyond Static Leaderboards: Predictive Validity for the Evaluation of LLM Agents • Paper • 2606.19704 • Published 6 days ago • 39
S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence • Paper • 2606.20515 • Published 6 days ago • 39
ViT-Up: Faithful Feature Upsampling for Vision Transformers • Paper • 2606.14024 • Published 12 days ago • 8
SciOrch: Learning to Orchestrate Expert LLMs for Solving Frontier Multimodal Scientific Reasoning Tasks • Paper • 2606.15872 • Published 9 days ago • 8
Native Active Perception as Reasoning for Omni-Modal Understanding • Paper • 2606.19341 • Published 7 days ago • 16
From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning • Paper • 2606.17682 • Published 8 days ago • 26
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling • Paper • 2606.18023 • Published 8 days ago • 203
ActWorld: From Explorable to Interactive World Model via Action-Aware Memory • Paper • 2606.17730 • Published 8 days ago • 8
Rethinking the Role of Efficient Attention in Hybrid Architectures • Paper • 2606.15378 • Published 11 days ago • 17
MotionVLA: Vision-Language-Action Model for Humanoid Motion • Paper • 2606.15142 • Published 11 days ago • 4