Yiping Wang

ypwang61

https://ypwang61.github.io/

AI & ML interests

machine learning

Recent Activity

liked a dataset 12 days ago

siegelz/core-bench

upvoted a paper 26 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

upvoted a paper about 1 month ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Organizations

None yet

Collections 1

Papers 1

arxiv:2504.20571

models 26

datasets 1

ypwang61/One-Shot-RLVR-Datasets

Viewer • Updated May 19 • 1.98k • 76 • 5

Collections 1

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1

ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi13

ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1209

models 26

ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1_pi2

ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1209

ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi2

ypwang61/One-Shot-RLVR-Llama3.2-3B-Instruct-pi1_pi13

ypwang61/One-Shot-RLVR-Llama3.2-3B-Instruct-1.2k-dsr-sub

ypwang61/One-Shot-RLVR-Llama3.2-3B-Instruct-pi1

ypwang61/One-Shot-RLVR-R1-Distill-1.5B-1.2k-dsr-sub

ypwang61/One-Shot-RLVR-R1-Distill-1.5B-16-shot

ypwang61/One-Shot-RLVR-R1-Distill-1.5B-4-shot

ypwang61/One-Shot-RLVR-R1-Distill-1.5B-pi1
