Update submission.json with dpo
#39 opened by robbiemu
train:
hf jobs uv run \
--flavor a100-large \
--timeout 3h \
--secrets HF_TOKEN \
dpo_training.py
eval:
hf jobs uv run \
--flavor a10g-large \
--timeout 2h \
--with "lighteval[vllm]@git+https://github.com/huggingface/lighteval,emoji" \
--secrets HF_TOKEN \
lighteval vllm "model_name=robbiemu/smollm3-dpo-aligned" \
"lighteval|gsm8k|0|0,leaderboard|truthfulqa:mc|0|0,leaderboard|hellaswag|0|0,leaderboard|arc:challenge|0|0" \
--push-to-hub --results-org robbiemu
dpo_training.py is published alongside the model; it is nearly identical to the script from the exercise.