Beichen Zhang

BeichenZhang

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

upvoted a paper about 2 months ago

ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation

upvoted a paper about 2 months ago

Think Visually, Reason Textually: Vision-Language Synergy in ARC

Organizations

None yet

upvoted a paper about 1 month ago

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Paper • 2512.05111 • Published Dec 4, 2025 • 47

upvoted 2 papers about 2 months ago

ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation

Paper • 2512.03036 • Published Dec 2, 2025 • 21

Think Visually, Reason Textually: Vision-Language Synergy in ARC

Paper • 2511.15703 • Published Nov 19, 2025 • 8

commented on a paper about 2 months ago

Think Visually, Reason Textually: Vision-Language Synergy in ARC

Paper • 2511.15703 • Published Nov 19, 2025 • 8

upvoted 2 papers 3 months ago

Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning

Paper • 2510.27606 • Published Oct 31, 2025 • 28

STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence

Paper • 2510.24693 • Published Oct 28, 2025 • 18

upvoted 3 papers 4 months ago

upvoted 2 papers 5 months ago

CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

Paper • 2508.20096 • Published Aug 27, 2025 • 36

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

Paper • 2508.04700 • Published Aug 6, 2025 • 52

upvoted 3 papers 6 months ago

Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models

Paper • 2508.00819 • Published Aug 1, 2025 • 62

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction

Paper • 2507.15852 • Published Jul 21, 2025 • 38

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

Paper • 2507.07984 • Published Jul 10, 2025 • 42

upvoted a paper 7 months ago

ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing

Paper • 2506.19848 • Published Jun 24, 2025 • 26

liked a model 7 months ago

zer0int/LongCLIP-GmP-ViT-L-14

Zero-Shot Image Classification • 0.4B • Updated Jul 16, 2025 • 2.09k • 79

upvoted 2 papers 8 months ago

Video World Models with Long-term Spatial Memory

Paper • 2506.05284 • Published Jun 5, 2025 • 55

Visual Agentic Reinforcement Fine-Tuning

Paper • 2505.14246 • Published May 20, 2025 • 32

authored 2 papers 8 months ago

Long-CLIP: Unlocking the Long-Text Capability of CLIP

Paper • 2403.15378 • Published Mar 22, 2024 • 4

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 321
