bzq

s70049

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a collection about 19 hours ago

SenseNova-U1

Collection

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 7 items • Updated about 22 hours ago • 58

upvoted a paper about 22 hours ago

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published 2 days ago • 110

upvoted a paper 20 days ago

OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis

Paper • 2604.15093 • Published 28 days ago • 28

upvoted a paper about 2 months ago

EVA: Efficient Reinforcement Learning for End-to-End Video Agent

Paper • 2603.22918 • Published Mar 24 • 44

upvoted an article 2 months ago

NEO-unify: Building Native Multimodal Unified Models End to End

Article • sensenova • Mar 5 • 159

upvoted 3 papers 7 months ago

From Pixels to Words -- Towards Native Vision-Language Primitives at Scale

Paper • 2510.14979 • Published Oct 16, 2025 • 69

InteractiveOmni: A Unified Omni-modal Model for Audio-Visual Multi-turn Dialogue

Paper • 2510.13747 • Published Oct 15, 2025 • 32

CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving

Paper • 2510.07944 • Published Oct 9, 2025 • 25

upvoted a paper 8 months ago

ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding

Paper • 2508.21496 • Published Aug 29, 2025 • 55

upvoted 3 papers about 1 year ago

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

Paper • 2504.15279 • Published Apr 21, 2025 • 78

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14, 2025 • 309

Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy

Paper • 2503.19757 • Published Mar 25, 2025 • 51
