Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published 14 days ago • 263
A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces Paper • 2602.03442 • Published 20 days ago • 19
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published 24 days ago • 195
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published Dec 8, 2025 • 78
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 105
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models Paper • 2601.23143 • Published 24 days ago • 38
Self-Improving Pretraining: using post-trained models to pretrain better models Paper • 2601.21343 • Published 25 days ago • 17
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting Paper • 2601.02151 • Published Jan 5 • 109
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking Paper • 2601.06487 • Published Jan 10 • 53
Nested Learning: The Illusion of Deep Learning Architectures Paper • 2512.24695 • Published Dec 31, 2025 • 44
Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards Paper • 2601.06021 • Published Jan 9 • 47
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published Jan 13 • 148
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge Paper • 2601.08808 • Published Jan 13 • 39