Hugging Face
xxyyy123's Collections
LLM

updated Sep 25, 2024

  • MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

    Paper • 2407.21770 • Published Jul 31, 2024 • 23

  • VILA^2: VILA Augmented VILA

    Paper • 2407.17453 • Published Jul 24, 2024 • 42

  • The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective

    Paper • 2407.08583 • Published Jul 11, 2024 • 13

  • Vision language models are blind

    Paper • 2407.06581 • Published Jul 9, 2024 • 83

  • ColPali: Efficient Document Retrieval with Vision Language Models

    Paper • 2407.01449 • Published Jun 27, 2024 • 48

  • Multimodal Structured Generation: CVPR's 2nd MMFM Challenge Technical Report

    Paper • 2406.11403 • Published Jun 17, 2024 • 4

  • AIDC-AI/Ovis1.6-Gemma2-9B

    Image-Text-to-Text • Updated Feb 26 • 906 • 270