Can Vision Language Models Infer Human Gaze Direction? A Controlled Study Paper • 2506.05412 • Published Jun 2025 • 4
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time Paper • 2506.18890 • Published Jun 2025 • 4
VEGGIE: Instructional Editing and Reasoning of Video Concepts with Grounded Generation Paper • 2503.14350 • Published Mar 18, 2025
Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation Paper • 2504.16060 • Published Apr 22, 2025
DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences Paper • 2406.03008 • Published Jun 5, 2024
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models Paper • 2407.07035 • Published Jul 9, 2024
Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors Paper • 2502.13311 • Published Feb 18, 2025 • 1
Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions Paper • 2406.09264 • Published Jun 13, 2024 • 2
Towards Collaborative Plan Acquisition through Theory of Mind Modeling in Situated Dialogue Paper • 2305.11271 • Published May 18, 2023
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation Paper • 2402.16846 • Published Feb 26, 2024
DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents Paper • 2210.12511 • Published Oct 22, 2022
World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models Paper • 2306.08685 • Published Jun 14, 2023 • 1
DANLI: Deliberative Agent for Following Natural Language Instructions Paper • 2210.12485 • Published Oct 22, 2022
Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models Paper • 2310.19619 • Published Oct 30, 2023
CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation Paper • 2310.13165 • Published Oct 19, 2023