Shusheng Yang

ShushengYang
https://shushengyang.com
  • shushengyang
  • vealocia

AI & ML interests

computer vision, vision language model

Organizations

NYU VisionX, Space

authored 5 papers 7 months ago

Qwen Technical Report

Paper • 2309.16609 • Published Sep 28, 2023 • 36

ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers

Paper • 2305.15272 • Published May 24, 2023

TouchStone: Evaluating Vision-Language Models by Language Models

Paper • 2308.16890 • Published Aug 31, 2023 • 1

Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection

Paper • 2204.02964 • Published Apr 6, 2022

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

Paper • 2412.14171 • Published Dec 18, 2024 • 24
authored a paper 11 months ago

Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities

Paper • 2308.12966 • Published Aug 24, 2023 • 8
authored a paper about 1 year ago

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24, 2024 • 61