Mirza Milan Farabi

mmfarabi

AI & ML interests

AI & ML

Organizations

None yet

mmfarabi's activity

updated a Space about 2 months ago

LeadGenAI

LinkedIn Lead Generation and Business Optimization App

published a Space about 2 months ago

LeadGenAI

LinkedIn Lead Generation and Business Optimization App

reacted to davanstrien's post with 👍 about 2 months ago

Hacked together a way to log trl GRPO training completions to a 🤗 dataset repo. This allows you to:

- Track rewards from multiple reward functions
- Treat the completion and rewards from training as a "proper" dataset and do EDA
- Share results for open science

The implementation is super hacky, but I'm curious if people would find this useful.

To push completions to the Hub, you just need two extra parameters:

log_completions=True
log_completions_hub_repo='your-username/repo-name'

Example dataset: https://huggingface.co/datasets/davanstrien/test-logs
Colab: https://colab.research.google.com/drive/1wzBFPVthRYYTp-mEYlznLg_e_0Za1M3g
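To illustrate the idea in the post, here is a minimal sketch of how completions and the scores from several reward functions could be flattened into dataset-style rows. It is not davanstrien's implementation and does not use TRL's internals: the helper `build_completion_records`, the column names, and the two toy reward functions are all hypothetical, chosen only for this example.

```python
from typing import Callable, Dict, List

# Hypothetical signature for a reward function: (prompt, completion) -> score.
RewardFn = Callable[[str, str], float]


def build_completion_records(
    step: int,
    prompts: List[str],
    completions: List[str],
    reward_fns: Dict[str, RewardFn],
) -> List[dict]:
    """Flatten one training step into rows: one row per completion,
    with one column per reward function (assumed column naming)."""
    records = []
    for prompt, completion in zip(prompts, completions):
        row = {"step": step, "prompt": prompt, "completion": completion}
        for name, fn in reward_fns.items():
            row[f"reward_{name}"] = float(fn(prompt, completion))
        records.append(row)
    return records


# Toy reward functions, for illustration only.
def length_reward(prompt: str, completion: str) -> float:
    return float(len(completion))


def period_reward(prompt: str, completion: str) -> float:
    return 1.0 if completion.endswith(".") else 0.0


records = build_completion_records(
    step=3,
    prompts=["p1", "p2"],
    completions=["Hello.", "no period"],
    reward_fns={"length": length_reward, "period": period_reward},
)

# With the `datasets` library installed and a Hub login, rows like these
# could then be shared, for example:
#   from datasets import Dataset
#   Dataset.from_list(records).push_to_hub("your-username/repo-name")
```

Keeping one column per reward function makes it straightforward to compare the reward signals against each other during EDA.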

upvoted 2 collections about 2 months ago

Granite 3.1 Language Models

A series of language models with 128K context length, trained by IBM and licensed under the Apache 2.0 license. • 9 items • Updated Feb 24 • 59

Granite Code Models

A series of code models trained by IBM and licensed under the Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated Feb 24 • 191