Zican Hu

huzican

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 18 days ago

Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation

Paper • 2509.14760 • Published 19 days ago • 51

liked a dataset 24 days ago

Sylence/SSMR-Bench

Viewer • Updated about 1 month ago • 19.2k • 193 • 3

upvoted a paper 28 days ago

WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents

Paper • 2509.06501 • Published 29 days ago • 77

upvoted a paper about 2 months ago

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Paper • 2508.09834 • Published Aug 13 • 53

authored 2 papers 2 months ago

Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning

Paper • 2505.19761 • Published May 26

Text-to-Decision Agent: Offline Meta-Reinforcement Learning from Natural Language Supervision

Paper • 2504.15046 • Published Apr 21

upvoted 2 papers 3 months ago

STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models

Paper • 2507.15375 • Published Jul 21 • 26

IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction

Paper • 2507.02025 • Published Jul 2 • 35

authored 2 papers 4 months ago

Attention-Guided Contrastive Role Representations for Multi-Agent Reinforcement Learning

Paper • 2312.04819 • Published Dec 8, 2023

Mixture-of-Experts Meets In-Context Reinforcement Learning

Paper • 2506.05426 • Published Jun 5 • 4

upvoted 3 papers 4 months ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28 • 130

CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models

Paper • 2505.12504 • Published May 18 • 24

MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision

Paper • 2505.13427 • Published May 19 • 26

upvoted a paper 6 months ago

Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21 • 88

authored a paper 6 months ago

Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21 • 88