arxiv:2510.09426
DongJae Shin
ShinDJ
AI & ML interests
NLP, LLM, Vision-Language Model
Recent Activity
upvoted an article 3 days ago
We Got Claude to Fine-Tune an Open Source LLM
reacted to sergiopaniego's post with 🔥 3 days ago
NEW: @mistralai released a fantastic family of multimodal models, Ministral 3.
You can fine-tune them for free on Colab using TRL ⚡️, which supports both SFT and GRPO.
Link to the notebooks:
- SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_ministral3_vl.ipynb
- GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_ministral3_vl.ipynb
- TRL and more examples: https://huggingface.co/docs/trl/index
reacted to sergiopaniego's post with 👍 12 days ago
Interested in RL training environments?
We just released a beginner-friendly walkthrough notebook!
Train a model to play Wordle using TRL + OpenEnv (TextArena) + GRPO + vLLM.
Happy learning! 🌱
Notebook: https://github.com/huggingface/trl/blob/main/examples/notebooks/openenv_wordle_grpo.ipynb
OpenEnv guide in TRL: https://huggingface.co/docs/trl/main/en/openenv