Bill Yuchen Lin

yuchenlin

https://yuchenlin.xyz

AI & ML interests

Research @allenai: LLMs, Multimodality, and Agents

Recent Activity

upvoted a paper about 1 month ago

TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning

Paper • 2505.14625 • Published May 20 • 13

New activity in allenai/ZebraLogic 3 months ago

Update ZeroEval-main/result_dirs/zebra-grid.summary.json

#5 opened 3 months ago by

updated a Space 3 months ago

Zebra Logic Bench

Explore and evaluate Zebra Logic models

authored a paper 3 months ago

CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation

Paper • 2504.00043 • Published Mar 30 • 9

liked a Space 3 months ago

Zebra Logic Bench

Render a leaderboard for model evaluation

updated a dataset 3 months ago

RLRM/Big-Math-RL-Verified-CT

Viewer • Updated Mar 14 • 251k • 7

published a dataset 4 months ago

RLRM/Big-Math-RL-Verified-CT

Viewer • Updated Mar 14 • 251k • 7

updated a dataset 4 months ago

RLRM/Big-Math-RL-Verified-CT

Viewer • Updated Mar 14 • 251k • 7

liked a model 4 months ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

Text Generation • Updated Feb 24 • 419k • 663

liked a Space 4 months ago

VL RewardBench

Explore vision-language model benchmarks on a leaderboard

authored a paper 4 months ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published Feb 17 • 38

liked a Space 5 months ago

Zebra Logic Bench

Explore and evaluate Zebra Logic models

authored a paper 5 months ago

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

Paper • 2502.01100 • Published Feb 3 • 17

liked a dataset 5 months ago

WildEval/ZebraLogic

Viewer • Updated Feb 4 • 4.26k • 416 • 7

updated a Space 5 months ago

Zebra Logic Bench

Explore and evaluate Zebra Logic models

updated a dataset 5 months ago

WildEval/ZebraLogic

Viewer • Updated Feb 4 • 4.26k • 416 • 7

upvoted a paper 5 months ago

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

Paper • 2502.01100 • Published Feb 3 • 17

published a dataset 5 months ago

WildEval/ZebraLogic

Viewer • Updated Feb 4 • 4.26k • 416 • 7

updated a Space 5 months ago

Rebiber

upvoted a collection 6 months ago

Magpie Reasoning Datasets

Reasoning datasets built by Magpie and its friends! • 8 items • Updated Jan 27 • 10