
Xiao Liang

MasterVito

AI & ML interests

None yet

Recent Activity

upvoted a paper about 12 hours ago

ClawEnvKit: Automatic Environment Generation for Claw-Like Agents

Paper • 2604.18543 • Published 23 days ago • 28

upvoted 2 papers about 1 month ago

OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks

Paper • 2604.08539 • Published Apr 9 • 49

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published Apr 2 • 101

upvoted 2 papers about 2 months ago

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published Mar 17 • 109

When AI Navigates the Fog of War

Paper • 2603.16642 • Published Mar 17 • 31

upvoted an article 3 months ago

DenseR: Dense Rewards For Free in LLM Reasoning

Article • hbXNov • Feb 18 • 21

upvoted 3 papers 3 months ago

Improving Data and Reward Design for Scientific Reasoning in Large Language Models

Paper • 2602.08321 • Published Feb 9 • 43

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Paper • 2602.01734 • Published Feb 2 • 32

Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability

Paper • 2602.02477 • Published Feb 2 • 11

upvoted 2 papers 4 months ago

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

Paper • 2601.14004 • Published Jan 20 • 48

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published Dec 27, 2025 • 50

upvoted 3 papers 5 months ago

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle

Paper • 2512.04324 • Published Dec 3, 2025 • 159

Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions

Paper • 2512.00097 • Published Nov 27, 2025 • 3

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 106

upvoted a paper 6 months ago

JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence

Paper • 2510.23538 • Published Oct 27, 2025 • 98

upvoted 2 papers 7 months ago

Deep Self-Evolving Reasoning

Paper • 2510.17498 • Published Oct 20, 2025 • 12

Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR

Paper • 2509.23808 • Published Sep 28, 2025 • 47

upvoted a paper 8 months ago

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Paper • 2509.15221 • Published Sep 18, 2025 • 111

upvoted a collection 8 months ago

Qwen3

Collection • 84 items • Updated Dec 31, 2025 • 1.78k

upvoted a paper 8 months ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 238
