PeterJinGo's Collections
Search-R1-v0.3

updated Aug 12, 2025

Models trained with RL using an outcome reward plus a format reward. Paper: https://arxiv.org/abs/2505.15117

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-ppo-v0.3

    3B • Updated May 21, 2025 • 50

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-it-em-ppo-v0.3

    3B • Updated May 21, 2025 • 1

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-em-grpo-v0.3

    3B • Updated May 21, 2025 • 147 • 1

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-3b-it-em-grpo-v0.3

    3B • Updated May 21, 2025 • 36

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-ppo-v0.3

    8B • Updated May 21, 2025 • 28.6k • 1

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-em-grpo-v0.3

    8B • Updated May 21, 2025 • 2

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-it-em-grpo-v0.3

    8B • Updated May 21, 2025 • 22 • 1

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-14b-em-ppo-v0.3

    15B • Updated May 2, 2025 • 1

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-14b-em-grpo-v0.3

    15B • Updated May 2, 2025

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-14b-it-em-grpo-v0.3

    15B • Updated May 2, 2025 • 26

  • PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-32b-em-grpo-v0.3

    33B • Updated May 10, 2025 • 131

  • PeterJinGo/LICENCE

    Viewer • Updated Aug 12, 2025 • 202 • 6
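
The repository names above appear to follow a consistent pattern. As a minimal sketch, the snippet below splits a repository ID into its parts. The field meanings are assumptions read off the names and are not stated on this page: `it` is taken to mark an instruction-tuned base model, `em` an exact-match outcome reward, and `ppo`/`grpo` the RL algorithm. `SearchR1Name` and `parse_repo_id` are hypothetical helpers, not part of any library.

```python
import re
from typing import NamedTuple


class SearchR1Name(NamedTuple):
    """Fields read off a repository ID (interpretation is an assumption)."""
    owner: str
    train_data: str
    base_model: str
    size: str
    instruct: bool   # True when the name contains "-it" (assumed: instruction-tuned)
    reward: str      # "em" (assumed: exact-match outcome reward)
    algorithm: str   # "ppo" or "grpo"
    version: str


# Pattern inferred from the IDs listed in this collection only.
_PATTERN = re.compile(
    r"^(?P<owner>[^/]+)/SearchR1-(?P<train_data>.+?)-"
    r"(?P<base_model>qwen2\.5)-(?P<size>\d+b)"
    r"(?P<instruct>-it)?-(?P<reward>em)-"
    r"(?P<algorithm>ppo|grpo)-(?P<version>v[\d.]+)$"
)


def parse_repo_id(repo_id: str) -> SearchR1Name:
    """Split a repository ID into its components, or raise ValueError."""
    match = _PATTERN.match(repo_id)
    if match is None:
        raise ValueError(f"unrecognized repository ID: {repo_id!r}")
    fields = match.groupdict()
    fields["instruct"] = fields["instruct"] is not None
    return SearchR1Name(**fields)


if __name__ == "__main__":
    name = parse_repo_id(
        "PeterJinGo/SearchR1-nq_hotpotqa_train-qwen2.5-7b-it-em-grpo-v0.3"
    )
    print(name.size, name.instruct, name.algorithm, name.version)
```

Entries that do not follow the pattern, such as `PeterJinGo/LICENCE`, raise `ValueError` rather than being guessed at.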