Tianjian Li
dogtooth
AI & ML interests
None yet
Recent Activity
upvoted a paper about 2 months ago
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning
authored a paper about 2 months ago
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
upvoted a paper about 2 months ago
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning