MM-graph

community

shengyi-qian authored a paper 3 months ago

DigiData: Training and Evaluating General-Purpose Mobile Control Agents

Paper • 2511.07413 • Published Nov 10, 2025 • 6

jingzhuu updated a dataset 9 months ago

mm-graph-org/mm-graph

Viewer • Updated May 20, 2025 • 2M • 418 • 1

jingzhuu in mm-graph-org/mm-graph 12 months ago

Add link to paper

#2 opened 12 months ago by

shengyi-qian authored 2 papers over 1 year ago

Multi-Object Hallucination in Vision-Language Models

Paper • 2407.06192 • Published Jul 8, 2024 • 12

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Paper • 2406.05132 • Published Jun 7, 2024 • 30

shengyi-qian authored 3 papers over 2 years ago

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

Paper • 2309.12311 • Published Sep 21, 2023 • 18

Understanding 3D Object Articulation in Internet Videos

Paper • 2203.16531 • Published Mar 30, 2022

Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation

Paper • 2303.11329 • Published Mar 20, 2023 • 1

shengyi-qian authored a paper almost 3 years ago

Understanding 3D Object Interaction from a Single Image

Paper • 2305.09664 • Published May 16, 2023 • 2