Mirza Milan Farabi

mmfarabi
AI & ML interests

AI & ML

Organizations

None yet

mmfarabi's activity

reacted to davanstrien's post with 👍 about 2 months ago
Hacked together a way to log trl GRPO training completions to a 🤗 dataset repo. This allows you to:

- Track rewards from multiple reward functions
- Treat the completion and rewards from training as a "proper" dataset and do EDA
- Share results for open science
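
As a rough illustration of the EDA point above, here is a minimal sketch of averaging each reward function over logged completions. The row layout and the column names (`step`, `completion`, `reward_format`, `reward_accuracy`) are assumptions made up for this example, not the actual schema of the logged dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical logged rows: one row per completion, one column per reward function.
# The column names are illustrative only, not the real dataset schema.
rows = [
    {"step": 1, "completion": "<answer>4</answer>", "reward_format": 1.0, "reward_accuracy": 0.0},
    {"step": 1, "completion": "<answer>5</answer>", "reward_format": 1.0, "reward_accuracy": 1.0},
    {"step": 2, "completion": "five", "reward_format": 0.0, "reward_accuracy": 0.0},
    {"step": 2, "completion": "<answer>5</answer>", "reward_format": 1.0, "reward_accuracy": 1.0},
]


def mean_reward_per_function(rows, prefix="reward_"):
    """Average every column whose name starts with `prefix` across all rows."""
    values_by_name = defaultdict(list)
    for row in rows:
        for key, value in row.items():
            if key.startswith(prefix):
                values_by_name[key].append(value)
    return {name: mean(values) for name, values in values_by_name.items()}


summary = mean_reward_per_function(rows)
print(summary)
```

Once the completions live in a dataset repo, the same kind of summary could be computed on the downloaded rows instead of a hard-coded list.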

The implementation is super hacky, but I'm curious if people would find this useful.

To push completions to the Hub, you just need two extra parameters:

log_completions=True
log_completions_hub_repo='your-username/repo-name'
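
A sketch of where those two parameters might go, assuming they are passed as keyword arguments to the GRPO training config. `log_completions_hub_repo` is taken from this post and is assumed to exist only in the hacked version described here; `output_dir` and its value are placeholders added for the example.

```python
# Keyword arguments for the training config. The two logging parameters are
# copied from the post; `output_dir` is a placeholder added for illustration.
grpo_kwargs = {
    "output_dir": "grpo-output",  # hypothetical local output directory
    "log_completions": True,
    "log_completions_hub_repo": "your-username/repo-name",
}

# With the modified trl from the post installed, these would presumably be used as:
#   from trl import GRPOConfig
#   config = GRPOConfig(**grpo_kwargs)
# This is an assumption; a stock trl install may reject the hub repo argument.
print(grpo_kwargs)
```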

Example dataset: davanstrien/test-logs
Colab: https://colab.research.google.com/drive/1wzBFPVthRYYTp-mEYlznLg_e_0Za1M3g