Shangziqi Zhao

zhaoshangziqi

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 7 days ago

From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones

Paper • 2509.25123 • Published 8 days ago • 17

upvoted a paper 18 days ago

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published 19 days ago • 104

upvoted 3 papers 19 days ago

Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published Sep 4 • 73

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Paper • 2509.09674 • Published 26 days ago • 76

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published 27 days ago • 175

upvoted a paper 27 days ago

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Paper • 2509.07894 • Published 28 days ago • 32

liked a dataset about 1 month ago

furonghuang-lab/PHTest

Viewer • Updated Sep 24, 2024 • 3.27k • 127 • 3

liked a dataset about 2 months ago

wentingzhao/one-million-instructions

Viewer • Updated Sep 16, 2023 • 2.33M • 55 • 7

liked 2 datasets 3 months ago

bench-llm/or-bench

Viewer • Updated Dec 19, 2024 • 82.3k • 1.72k • 15

ChilleD/CommonsenseQA

Viewer • Updated Jun 4, 2024 • 12.1k • 384 • 1

liked 3 datasets 5 months ago

meng-lab/AdaDecode-Llama-3.1-8B-Instruct-GSM8K

Viewer • Updated Sep 25, 2024 • 8.79k • 11 • 1

openai/gsm8k

Viewer • Updated Jan 4, 2024 • 17.6k • 389k • 890

fql/qwq_long_cot_math_gsm_v1

Viewer • Updated Dec 29, 2024 • 10.3k • 9 • 1

liked 2 datasets 6 months ago

amphora/QwQ-LongCoT-130K

Viewer • Updated Dec 22, 2024 • 133k • 103 • 150

qingy2024/QwQ-LongCoT-Verified-130K

Viewer • Updated Dec 19, 2024 • 467k • 119 • 31