Where do Large Vision-Language Models Look at when Answering Questions? Paper • 2503.13891 • Published 7 days ago • 5
CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners Paper • 2503.16356 • Published 4 days ago • 14
Aligning Multimodal LLM with Human Preference: A Survey Paper • 2503.14504 • Published 6 days ago • 20
Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models Paper • 2503.06269 • Published 16 days ago • 4
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models Paper • 2503.12885 • Published 8 days ago • 41
On the Limitations of Vision-Language Models in Understanding Image Transforms Paper • 2503.09837 • Published 12 days ago • 10
TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention Paper • 2503.10602 • Published 11 days ago • 4
Discovering Influential Neuron Path in Vision Transformers Paper • 2503.09046 • Published 13 days ago • 6
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models Paper • 2503.09573 • Published 12 days ago • 61
Gemini Embedding: Generalizable Embeddings from Gemini Paper • 2503.07891 • Published 14 days ago • 33
What's in a Latent? Leveraging Diffusion Latent Space for Domain Generalization Paper • 2503.06698 • Published 15 days ago • 4
Words or Vision: Do Vision-Language Models Have Blind Faith in Text? Paper • 2503.02199 • Published 21 days ago • 8
Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment Paper • 2503.07334 • Published 14 days ago • 16
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models Paper • 2503.07605 • Published 14 days ago • 65
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published 19 days ago • 215