Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games Paper • 2606.19338 • Published 17 days ago • 49
COLLEAGUE.SKILL: Automated AI Skill Generation via Expert Knowledge Distillation Paper • 2605.31264 • Published May 29 • 123
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language Paper • 2406.20085 • Published Jun 28, 2024 • 13
DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning Paper • 2602.11089 • Published Feb 11 • 18
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration Paper • 2604.14116 • Published Apr 15 • 13
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration Paper • 2604.14116 • Published Apr 15 • 13
Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization Paper • 2603.28342 • Published Mar 30 • 26
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published Mar 26 • 134
InCoder-32B: Code Foundation Model for Industrial Scenarios Paper • 2603.16790 • Published Mar 17 • 312
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation Paper • 2602.24286 • Published Feb 27 • 100
DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning Paper • 2602.11089 • Published Feb 11 • 18
DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning Paper • 2602.11089 • Published Feb 11 • 18
TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization Paper • 2601.16480 • Published Jan 23 • 50