Attention-driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models without Fine-Tuning • Paper • 2412.10840 • Published Dec 14, 2024
data-is-better-together/open-image-preferences-v1-results • Updated Dec 9, 2024
LlavaGuard • Collection • This collection contains the original repos of the LlavaGuard releases • 19 items • Updated May 12