Victor Gallego

vicgalle

https://github.com/vicgalle

AI & ML interests

Preference fine-tuning, alignment & synthetic data. Building LLMs in general!

Recent Activity

liked a dataset about 3 hours ago

SAIRfoundation/equational-theories-selected-problems

liked a dataset about 3 hours ago

SAIRfoundation/equational-theories-benchmark

liked a dataset 21 days ago

Solenopsisbot/real-slop

authored a paper 2 months ago

Distilling Feedback into Memory-as-a-Tool

Paper • 2601.05960 • Published Jan 9 • 3

submitted a paper to Daily Papers 2 months ago

Distilling Feedback into Memory-as-a-Tool

Paper • 2601.05960 • Published Jan 9 • 3

authored a paper 8 months ago

Specification Self-Correction: Mitigating In-Context Reward Hacking Through Test-Time Refinement

Paper • 2507.18742 • Published Jul 24, 2025 • 6

authored 2 papers almost 2 years ago

Merging Improves Self-Critique Against Jailbreak Attacks

Paper • 2406.07188 • Published Jun 11, 2024 • 4

Configurable Safety Tuning of Language Models with Synthetic Preference Data

Paper • 2404.00495 • Published Mar 30, 2024 • 2

authored a paper about 2 years ago

Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs

Paper • 2402.08005 • Published Feb 12, 2024 • 1

authored 4 papers over 2 years ago

Distilled Self-Critique of LLMs with Synthetic Data: a Bayesian Perspective

Paper • 2312.01957 • Published Dec 4, 2023 • 1

Fast Adaptation with Bradley-Terry Preference Models in Text-To-Image Classification and Generation

Paper • 2308.07929 • Published Jul 15, 2023 • 1

Personalizing Text-to-Image Generation via Aesthetic Gradients

Paper • 2209.12330 • Published Sep 25, 2022 • 1

ZYN: Zero-Shot Reward Models with Yes-No Questions

Paper • 2308.06385 • Published Aug 11, 2023 • 1