Zesen Cheng
ClownRat
AI & ML interests
Multi-modal foundation models; segmentation, detection, and tracking
Recent Activity
Liked a dataset 7 days ago: OpenGVLab/VideoChat-Flash-Training-Data
Liked a Space 15 days ago: lixin4ever/VideoRefer-VideoLLaMA3
Upvoted a paper about 1 month ago: Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning