Tianjian Li

dogtooth

https://tianjianl.github.io

tianjianl

AI & ML interests

None yet

Recent Activity

updated a dataset about 6 hours ago

dogtooth/rc-annotated-sft

published a dataset about 16 hours ago

dogtooth/rc-annotated-sft

updated a dataset 17 days ago

dogtooth/polaris_filtered_removed_all_correct

upvoted 4 papers 5 months ago

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published Oct 9, 2025 • 41

Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense

Paper • 2510.07242 • Published Oct 8, 2025 • 30

RESTRAIN: From Spurious Votes to Signals -- Self-Driven RL with Self-Penalization

Paper • 2510.02172 • Published Oct 2, 2025 • 7

The Flaw of Averages: Quantifying Uniformity of Performance on Benchmarks

Paper • 2509.25671 • Published Sep 30, 2025 • 6

upvoted a paper 6 months ago

Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2, 2025 • 25

upvoted 2 papers 10 months ago

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Paper • 2505.10320 • Published May 15, 2025 • 24

SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning

Paper • 2505.02363 • Published May 5, 2025 • 7

upvoted a paper 11 months ago

Certified Mitigation of Worst-Case LLM Copyright Infringement

Paper • 2504.16046 • Published Apr 22, 2025 • 13

upvoted a paper about 1 year ago

Compressed Chain of Thought: Efficient Reasoning Through Dense Representations

Paper • 2412.13171 • Published Dec 17, 2024 • 35

upvoted 3 papers over 1 year ago

Generative World Explorer

Paper • 2411.11844 • Published Nov 18, 2024 • 77

Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements

Paper • 2410.08968 • Published Oct 11, 2024 • 13

Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale

Paper • 2409.17115 • Published Sep 25, 2024 • 64