The ultimate guide to RL environments: building and scaling them in the LLM era • Building and scaling RL environments for LLM training • 158
nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 Any-to-Any • 33B • Updated 8 days ago • 257k • 291
Distilling 100B+ Models 40x Faster with TRL • Featured • TRL distillation for 100B+ teachers, 40x faster • 84