MentraSuite: Post-Training Large Language Models for Mental Health Reasoning and Assessment Paper • 2512.09636 • Published Dec 10, 2025 • 26
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation Paper • 2510.09116 • Published Oct 10, 2025 • 97
From Scores to Skills: A Cognitive Diagnosis Framework for Evaluating Financial Large Language Models Paper • 2508.13491 • Published Aug 19, 2025 • 59
FinAudio: A Benchmark for Audio Large Language Models in Financial Applications Paper • 2503.20990 • Published Mar 26, 2025 • 19