Weiyun Wang

Weiyun1025

AI & ML interests

None yet

Recent Activity

upvoted 2 papers 1 day ago

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published 10 days ago • 76

A Survey on Latent Reasoning

Paper • 2507.06203 • Published 1 day ago • 63

upvoted a paper 8 days ago

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published 9 days ago • 176

upvoted a paper 22 days ago

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Paper • 2506.14429 • Published 23 days ago • 44

upvoted 3 papers 23 days ago

DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

Paper • 2506.11763 • Published 27 days ago • 63

Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning

Paper • 2506.10521 • Published 28 days ago • 70

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published 24 days ago • 252

upvoted a paper 26 days ago

Magistral

Paper • 2506.10910 • Published 28 days ago • 61

upvoted 2 papers 30 days ago

MiniCPM4: Ultra-Efficient LLMs on End Devices

Paper • 2506.07900 • Published about 1 month ago • 82

Reinforcement Pre-Training

Paper • 2506.08007 • Published about 1 month ago • 242

upvoted 5 papers about 1 month ago

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

Paper • 2506.05176 • Published Jun 5 • 62

RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics

Paper • 2506.04308 • Published Jun 4 • 41

MiMo-VL Technical Report

Paper • 2506.03569 • Published Jun 4 • 74

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces

Paper • 2506.00123 • Published May 30 • 34

ZeroGUI: Automating Online GUI Learning at Zero Human Cost

Paper • 2505.23762 • Published May 29 • 46

upvoted 5 papers about 2 months ago

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19 • 46

MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder

Paper • 2505.07916 • Published May 12 • 126

MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

Paper • 2505.10557 • Published May 15 • 47

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

Paper • 2505.09568 • Published May 14 • 94

WorldPM: Scaling Human Preference Modeling

Paper • 2505.10527 • Published May 15 • 34