Castillo_Henry_903002104 / eval_test.sh
add everything but lm eval harness
c3b20da verified
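# Run the HellaSwag evaluation script with uv; the positional arguments (logs, DEBUG, 100) are interpreted by hellaswag.py.
uv run hellaswag.py logs DEBUG 100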