An Yan

zzxslp

AI & ML interests

Vision and Language, text generation

authored 2 papers 9 months ago

Trust but Verify: Programmatic VLM Evaluation in the Wild

Paper • 2410.13121 • Published Oct 17, 2024 • 3

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions

Paper • 2411.07461 • Published Nov 12, 2024 • 24

authored a paper 12 months ago

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16, 2024 • 101

authored 5 papers over 1 year ago

List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Paper • 2404.16375 • Published Apr 25, 2024 • 18

Learning Concise and Descriptive Attributes for Visual Recognition

Paper • 2308.03685 • Published Aug 7, 2023

MedEval: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation

Paper • 2310.14088 • Published Oct 21, 2023 • 1

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation

Paper • 2311.07562 • Published Nov 13, 2023 • 15

Bridging Language and Items for Retrieval and Recommendation

Paper • 2403.03952 • Published Mar 6, 2024