Hugging Face
Fan Zhang

ryanzhangfan

AI & ML interests

None yet

Organizations

Beijing Academy of Artificial Intelligence, Emu3-community

authored a paper 3 months ago

End-to-End Vision Tokenizer Tuning

Paper • 2505.10562 • Published May 15 • 22
authored 7 papers 11 months ago

CapsFusion: Rethinking Image-Text Data at Scale

Paper • 2310.20550 • Published Oct 31, 2023 • 27

Generative Multimodal Models are In-Context Learners

Paper • 2312.13286 • Published Dec 20, 2023 • 37

Generative Pretraining in Multimodality

Paper • 2307.05222 • Published Jul 11, 2023 • 22

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29

DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

Paper • 2407.08303 • Published Jul 11, 2024 • 19

Diffusion Feedback Helps CLIP See Better

Paper • 2407.20171 • Published Jul 29, 2024 • 37

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27, 2024 • 96