Qing Li

li-qing

https://liqing.io

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

commented on a paper about 1 month ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

upvoted a paper about 2 months ago

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

Organizations

None yet

authored 2 papers 8 months ago

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

Paper • 2412.15606 • Published Dec 20, 2024 • 2

LongViTU: Instruction Tuning for Long-Form Video Understanding

Paper • 2501.05037 • Published Jan 9, 2025 • 1

authored 13 papers about 1 year ago

Task-oriented Sequential Grounding in 3D Scenes

Paper • 2408.04034 • Published Aug 7, 2024 • 8

Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World

Paper • 2310.10207 • Published Oct 16, 2023

An Embodied Generalist Agent in 3D World

Paper • 2311.12871 • Published Nov 18, 2023 • 8

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding

Paper • 2401.09340 • Published Jan 17, 2024 • 22

3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment

Paper • 2308.04352 • Published Aug 8, 2023

VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding

Paper • 2403.11481 • Published Mar 18, 2024 • 13

Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting

Paper • 2403.15624 • Published Mar 22, 2024

SQA3D: Situated Question Answering in 3D Scenes

Paper • 2210.07474 • Published Oct 14, 2022

Neural-Symbolic Recursive Machine for Systematic Generalization

Paper • 2210.01603 • Published Oct 4, 2022

Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation

Paper • 2211.15402 • Published Nov 28, 2022

OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents

Paper • 2407.00114 • Published Jun 27, 2024 • 13

UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

Paper • 2407.05282 • Published Jul 7, 2024 • 15

FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models

Paper • 2407.11522 • Published Jul 16, 2024 • 9
