Fan Zhou

koalazf99 · https://koalazf99.github.io/

AI & ML interests

Deep Learning; Natural Language Processing; Foundation Models

Recent Activity

upvoted a paper about 12 hours ago

WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents

Paper • 2509.06501 • Published 1 day ago • 51

authored a paper 2 months ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published Jun 25 • 46

New activity in OctoThinker/MegaMath-Web-Pro-Max 2 months ago

[bot] Conversion to Parquet

#3 opened 2 months ago by parquet-converter

liked a dataset 3 months ago

OctoThinker/MegaMath-Web-Pro-Max

Viewer • Updated Jul 6 • 69.2M • 8.79k • 35

updated a collection 3 months ago

🐙 OctoThinker

Mid-training Incentivizes Reinforcement Learning Scaling • 18 items • Updated Jun 26 • 2

upvoted a paper 3 months ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published Jun 25 • 46

updated 2 collections 3 months ago

🐙 OctoThinker

Mid-training Incentivizes Reinforcement Learning Scaling • 18 items • Updated Jun 26 • 2

🧙 Guru

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective • 4 items • Updated Jun 20

authored a paper 3 months ago

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published Jun 17 • 49

upvoted a paper 3 months ago

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published Jun 17 • 49

liked 2 datasets 3 months ago

princeton-nlp/SWE-bench_Verified

Viewer • Updated Feb 18 • 500 • 1.28M • 199

LLM360/guru-RL-92k

Viewer • Updated 20 days ago • 91.9k • 1.08k • 20

upvoted 2 papers 3 months ago

Thinking with Generated Images

Paper • 2505.22525 • Published May 28 • 15

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published May 26 • 104

New activity in LLM360/MegaMath 4 months ago

Megamath-code parquets do not contain text column

#6 opened 4 months ago by

upvoted 2 papers 4 months ago

Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Paper • 2505.15612 • Published May 21 • 34

Efficient Agent Training for Computer Use

Paper • 2505.13909 • Published May 20 • 45