Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation Paper • 2502.05151 • Published Feb 7, 2025
DocMMIR: A Framework for Document Multi-modal Information Retrieval Paper • 2505.19312 • Published May 25, 2025 • 1
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs Paper • 2510.10689 • Published Oct 12, 2025 • 47
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 304
AutoMV: An Automatic Multi-Agent System for Music Video Generation Paper • 2512.12196 • Published Dec 13, 2025 • 7
Context as a Tool: Context Management for Long-Horizon SWE-Agents Paper • 2512.22087 • Published Dec 26, 2025 • 3
Encyclo-K: Evaluating LLMs with Dynamically Composed Knowledge Statements Paper • 2512.24867 • Published Dec 31, 2025 • 1
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space Paper • 2512.24617 • Published Dec 31, 2025 • 66
Large-Scale Terminal Agentic Trajectory Generation from Dockerized Environments Paper • 2602.01244 • Published Feb 1 • 16
CMI-RewardBench: Evaluating Music Reward Models with Compositional Multimodal Instruction Paper • 2603.00610 • Published Feb 28 • 35
InCoder-32B: Code Foundation Model for Industrial Scenarios Paper • 2603.16790 • Published about 1 month ago • 308
Dynamics Within Latent Chain-of-Thought: An Empirical Study of Causal Structure Paper • 2602.08783 • Published Feb 9 • 1
InCoder-32B-Thinking: Industrial Code World Model for Thinking Paper • 2604.03144 • Published 14 days ago • 231
EvolveCoder: Evolving Test Cases via Adversarial Verification for Code Reinforcement Learning Paper • 2603.12698 • Published Mar 13 • 1
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 29 days ago • 66
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published about 1 month ago • 94
Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning Paper • 2509.22824 • Published Sep 26, 2025 • 21
VideoScore2: Think before You Score in Generative Video Evaluation Paper • 2509.22799 • Published Sep 26, 2025 • 26