Victor Gallego

vicgalle

https://github.com/vicgalle

AI & ML interests

Preference fine-tuning, alignment & synthetic data. Building LLMs in general!

Recent Activity

liked 2 datasets 5 days ago

common-pile/caselaw_access_project

Viewer • Updated Jun 6 • 5.52M • 2.22k • 133

NousResearch/Hermes-3-Dataset

Viewer • Updated 9 days ago • 959k • 2.19k • 171

liked 2 models 9 days ago

moonshotai/Kimi-K2-Instruct

Text Generation • Updated 2 days ago • 145k • 1.54k

moonshotai/Kimi-K2-Base

Text Generation • Updated 7 days ago • 3.34k • 227

liked 2 datasets 9 days ago

PrimeIntellect/SYNTHETIC-2-RL

Viewer • Updated 9 days ago • 156k • 166 • 3

PrimeIntellect/SYNTHETIC-2

Viewer • Updated 9 days ago • 51.6k • 298 • 4

New activity in vicgalle/gpt2-alpaca 10 days ago

License Conflict: MIT vs CC BY-NC 4.0

#3 opened 22 days ago by

New activity in vicgalle/xlm-roberta-large-xnli-anli 10 days ago

License Conflict: MIT vs CC BY-NC 4.0

#3 opened 22 days ago by

liked a model 11 days ago

m-a-p/CriticLeanGPT-Qwen3-8B-RL

8B • Updated 10 days ago • 28 • 3

upvoted a paper 11 days ago

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

Paper • 2507.06181 • Published 12 days ago • 39

liked a dataset 15 days ago

interstellarninja/json-mode-reasoning

Viewer • Updated 15 days ago • 20.5k • 218 • 4

liked a dataset 22 days ago

PKU-Alignment/DeceptionBench

Viewer • Updated May 27 • 180 • 440 • 1

New activity in vicgalle/CarbonBeagle-11B 25 days ago

License Compatibility

#4 opened 26 days ago by

New activity in vicgalle/Configurable-Hermes-3-Llama-3.1-8B 25 days ago

License incompatibility

#1 opened 25 days ago by

updated a model 25 days ago

vicgalle/Configurable-Hermes-3-Llama-3.1-8B

Text Generation • 8B • Updated 25 days ago • 48 • 6

upvoted a paper 25 days ago

Robust Reward Modeling via Causal Rubrics

Paper • 2506.16507 • Published about 1 month ago • 8

updated a model 26 days ago

vicgalle/CarbonBeagle-11B

Text Generation • 11B • Updated 26 days ago • 10k • 9

upvoted a collection 27 days ago

Configurable Preference Tuning ⚙️📝

CPT uses rubric-guided synthetic data and DPO to enable LLMs to dynamically adjust behavior (e.g., writing style) at inference with system prompts • 7 items • Updated Jun 17 • 1

New activity in vicgalle/configurable-preference-phi4 29 days ago

Add license, link to paper and code repository

#3 opened 29 days ago by

New activity in vicgalle/configurable-preference-rocinante-12b 29 days ago

Add Apache 2.0 License

#2 opened 29 days ago by