Zesen Cheng

ClownRat

https://clownrat6.github.io/

AI & ML interests

Multi-modal foundation models; segmentation, detection, and tracking

Recent Activity

liked a dataset 7 days ago

OpenGVLab/VideoChat-Flash-Training-Data

liked a Space 15 days ago

lixin4ever/VideoRefer-VideoLLaMA3

upvoted a paper about 1 month ago

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Collections 1

Papers 15

arxiv:2503.14428

arxiv:2502.13923

arxiv:2501.13106

arxiv:2501.00599

Models 5

ClownRat/VideoLLaMA2.1-7B-16F

Text Generation • 8B • Updated Jan 6 • 2

ClownRat/resnet-50-torchvision

0.0B • Updated Dec 25, 2024 • 2

ClownRat/mask2former-resnet-50-coco-instance

0.0B • Updated Dec 25, 2024 • 5

ClownRat/resnet-101-torchvision

0.0B • Updated Dec 23, 2024 • 2

ClownRat/mask2former-resnet-101-coco-instance

0.1B • Updated Dec 17, 2024 • 4

Datasets 2

ClownRat/YoutubeVIS-2019

Updated Jan 26 • 6

ClownRat/COCO2017-Instance

Viewer • Updated Dec 11, 2024 • 123k • 16 • 1