shawon's Collections

VLMs

updated 8 days ago

  • PointArena: Probing Multimodal Grounding Through Language-Guided Pointing

    Paper • 2505.09990 • Published 10 days ago • 11

  • Style Customization of Text-to-Vector Generation with Image Diffusion Priors

    Paper • 2505.10558 • Published 10 days ago • 15

  • Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis

    Paper • 2505.10046 • Published 10 days ago • 9

  • X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real

    Paper • 2505.07096 • Published 13 days ago • 3