Motoki Wu PRO

tokestermw

https://motoki.co

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published 6 days ago • 162

upvoted a paper 7 days ago

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published 11 days ago • 105

upvoted 2 papers 13 days ago

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

Paper • 2508.16949 • Published 16 days ago • 22

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published 17 days ago • 133

liked a Space 13 days ago

534

Sheets

🗂

Create and enrich datasets using AI

liked a model 14 days ago

xai-org/grok-2

Updated 15 days ago • 4.82k • 926

liked a Space 19 days ago

188

Jupyter Agent 2

🏃

Run code and analyze data in a Jupyter notebook

liked a model 19 days ago

stepfun-ai/NextStep-1-Large-Edit

Image-to-Image • 15B • Updated 20 days ago • 736 • 47

upvoted a collection 19 days ago

NVIDIA Nemotron

Collection

Open, Production-ready Enterprise Models. Nvidia Open Model license. • 4 items • Updated 5 days ago • 56

liked 2 models 20 days ago

deepseek-ai/DeepSeek-V3.1-Base

Text Generation • 685B • Updated 13 days ago • 25.7k • 975

nvidia/NVIDIA-Nemotron-Nano-9B-v2

Text Generation • 9B • Updated 9 days ago • 88.7k • 321

upvoted a paper 20 days ago

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published 25 days ago • 91

liked a model 20 days ago

mistralai/Mistral-Small-3.2-24B-Instruct-2506

24B • Updated 18 days ago • 266k • 446

upvoted a paper 24 days ago

Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models

Paper • 2508.10751 • Published 25 days ago • 27

liked 5 models 24 days ago

Snowflake/snowflake-arctic-embed-l-v2.0

Snowflake/snowflake-arctic-embed-l

liked a model 25 days ago

jxm/gpt-oss-20b-base

Text Generation • 12B • Updated 19 days ago • 10.8k • 216