LeRobot Humanoid No-Arms — Velocity Tracking Policy (v16, iter 25000)

RL policy trained on the LeRobot bipedal humanoid (12 DoF, no upper body). Task: flat-ground velocity tracking.

Files

velocity_v16_iter25000/policy.onnx — Actor network (ONNX, 620 KiB)
- Input: obs, shape [1, N] (flattened concatenation of base_ang_vel, projected_gravity, velocity_commands, joint_pos, joint_vel, and last_action, with history_length=5)
- Output: actions, shape [1, 12] (joint-position offsets, clipped to [-1, 1])
velocity_v16_iter25000/env.yaml — full env config (joint order, action scale, default init pose, all reward terms, all events)
velocity_v16_iter25000/agent.yaml — PPO hyperparameters (for reference)
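
A minimal sketch of one inference step with onnxruntime, using the tensor names obs and actions listed above. The per-term sizes (3 for base_ang_vel, projected_gravity, and velocity_commands; 12 for each joint term), the ACTION_SCALE value, and the all-zero DEFAULT_JOINT_POS are placeholders rather than values taken from this repository: read the real ones from env.yaml and check N against the input shape the model reports. Treating the output as target = default pose + scale * action is the usual Isaac Lab joint-position action and is likewise an assumption here.

```python
# Assumed per-term sizes for a 12-DoF robot; confirm against env.yaml.
TERM_DIMS = {
    "base_ang_vel": 3,
    "projected_gravity": 3,
    "velocity_commands": 3,
    "joint_pos": 12,
    "joint_vel": 12,
    "last_action": 12,
}
HISTORY_LENGTH = 5  # from the model card

# Placeholders: replace with the action scale and default pose from env.yaml.
ACTION_SCALE = 0.25
DEFAULT_JOINT_POS = [0.0] * 12


def flat_obs_dim(term_dims, history_length):
    """Size N of the flat observation if every term keeps history_length frames."""
    return history_length * sum(term_dims.values())


def actions_to_joint_targets(actions, default_pos, scale):
    """Clip each action to [-1, 1], then offset the default joint pose."""
    return [
        q0 + scale * max(-1.0, min(1.0, a))
        for a, q0 in zip(actions, default_pos)
    ]


def run_policy(onnx_path, obs_vector):
    """Run one forward pass; needs numpy and onnxruntime installed."""
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    model_input = session.get_inputs()[0]
    # Compare the reported shape with flat_obs_dim() before trusting the layout.
    print("model input:", model_input.name, model_input.shape)
    obs = np.asarray(obs_vector, dtype=np.float32).reshape(1, -1)
    actions = session.run(None, {model_input.name: obs})[0]
    return actions[0].tolist()


if __name__ == "__main__":
    n = flat_obs_dim(TERM_DIMS, HISTORY_LENGTH)
    print("expected N under the assumed term sizes:", n)
    # Uncomment once the model file is available locally:
    # actions = run_policy("velocity_v16_iter25000/policy.onnx", [0.0] * n)
    # targets = actions_to_joint_targets(actions, DEFAULT_JOINT_POS, ACTION_SCALE)
```

The ordering of terms and history frames inside the flat vector is not specified above, so a real deployment should reproduce the layout defined in env.yaml rather than rely on this sketch.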

Training

Framework: WBC-AGILE (NVIDIA Isaac Lab)
Source task: Velocity-LeRobot-NoArms-v0 (adapted from Velocity-T1-v0)
Training length: ~25,000 iterations with 6144 parallel environments
Mean episode length at export: ~280 steps (5.6 s)
Initial state: counter-rotated in simulation to work around a URDF frame offset; expect a sim-to-real gap until the URDF is fixed
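
The reported episode length gives a quick way to recover the control period. Assuming the episode length counts policy steps (an assumption; the decimation and physics step are defined in env.yaml), 280 steps over 5.6 s corresponds to a policy rate of about 50 Hz, which a deployment loop would need to match:

```python
EP_LEN_STEPS = 280    # mean episode length at export, from the model card
EP_LEN_SECONDS = 5.6  # the same figure in seconds, from the model card

control_dt = EP_LEN_SECONDS / EP_LEN_STEPS  # seconds per policy step
control_hz = 1.0 / control_dt               # policy steps per second

print(f"policy step: {control_dt * 1000:.1f} ms, rate: {control_hz:.1f} Hz")
```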

Limitations

The policy falls after ~5 s on average; training is still in progress.
The initial state uses a counter-rotated torso to compensate for the URDF frame offset. The real robot starts upright, so observations at t=0 will not match the training distribution.
Domain randomization is moderate, so further sim-to-real issues are likely.
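
To illustrate the t=0 mismatch, the sketch below computes a projected_gravity observation for an upright base and for a base pitched by a hypothetical 10 degrees. The actual counter-rotation angle and axis are not stated here, so OFFSET_DEG and the choice of the pitch axis are illustrative only. The rotation uses the common (w, x, y, z) quaternion convention, which is also an assumption about how the environment defines this term.

```python
import math

GRAVITY_DIR_WORLD = (0.0, 0.0, -1.0)  # unit gravity direction, world frame
OFFSET_DEG = 10.0  # hypothetical; the real offset comes from the URDF and env.yaml


def quat_from_pitch(angle_rad):
    """Quaternion (w, x, y, z) for a rotation of angle_rad about the y axis."""
    return (math.cos(angle_rad / 2.0), 0.0, math.sin(angle_rad / 2.0), 0.0)


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def rotate_inverse(q, v):
    """Express world-frame vector v in the body frame with orientation q."""
    w, qv = q[0], q[1:]
    dot = sum(qi * vi for qi, vi in zip(qv, v))
    c = cross(qv, v)
    return tuple(
        vi * (2.0 * w * w - 1.0) - 2.0 * w * ci + 2.0 * dot * qi
        for vi, ci, qi in zip(v, c, qv)
    )


upright = rotate_inverse(quat_from_pitch(0.0), GRAVITY_DIR_WORLD)
tilted = rotate_inverse(quat_from_pitch(math.radians(OFFSET_DEG)), GRAVITY_DIR_WORLD)

print("projected gravity, upright base:", upright)
print("projected gravity, tilted base: ", tilted)
```

If the policy only saw the tilted value at the start of training episodes, the upright value on the real robot is outside that distribution even though the robot is standing correctly.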
