arxiv:2504.05288

LiveVQA: Live Visual Knowledge Seeking

Published on Apr 7 · Submitted by shuaishuaicdp on Apr 8

Authors: Yuyang Peng, Benlin Liu, Dongping Chen

Abstract

We introduce LiveVQA, an automatically collected dataset of the latest visual knowledge from the Internet with synthesized VQA problems. LiveVQA consists of 3,602 single- and multi-hop visual questions from 6 news websites across 14 news categories, featuring high-quality image-text coherence and authentic information. Our evaluation across 15 MLLMs (e.g., GPT-4o, Gemma-3, and the Qwen-2.5-VL family) shows that stronger models perform better overall, with advanced visual reasoning capabilities proving crucial for complex multi-hop questions. Despite excellent performance on textual problems, models equipped with tools such as search engines still show significant gaps on visual questions that require the latest visual knowledge, highlighting important areas for future research.

Community

Paper author Paper submitter 8 days ago

Our work is still in progress. If you have interest in this topic or like this paper, feel free to reach out!

7 days ago

Cool work! Consider citing livexiv!
https://arxiv.org/abs/2410.10783

Paper author 7 days ago

Thanks for the suggestion! It is closely related to our work. We will add this missing related work.

