csuhan's Collections
Tar
OneLLM

Tar

updated about 6 hours ago

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations


  • Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

    Paper • 2506.18898 • Published 3 days ago • 23

  • Running on Zero
    1

    Tar

    🚀

    Unified MLLM with Text-Aligned Representations


  • Running on A10G
    59

    Tar

    🚀

    Unified MLLM with Text-Aligned Representations


  • csuhan/TA-Tok

    Updated 15 days ago

  • csuhan/tar_1.5B_pretrain_demo

    Updated 10 days ago