Kimi-K2-Instruct / .eval_results /swe_bench_pro.yaml
Add SWE-Bench Pro evaluation results (#65)
4bbe370
- dataset:
    id: ScaleAI/SWE-bench_Pro
    task_id: SWE_Bench_Pro
  value: 27.67
  source:
    url: https://scale.com/leaderboard/swe_bench_pro_public
    name: SWE-Bench Pro official evaluation results
    user: nielsr