
Shehan Munasinghe

shehan97
https://shehanmunasinghe.github.io/
  • shehan_u_e_m
  • shehanmunasinghe

AI & ML interests

Computer Vision, Multi-modal learning

Recent Activity

upvoted a paper 22 days ago
Sekai: A Video Dataset towards World Exploration
upvoted a paper about 1 month ago
CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark
upvoted a paper 4 months ago
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Organizations

Mohamed Bin Zayed University of Artificial Intelligence

commented 2 papers 8 months ago

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Paper • 2411.04923 • Published Nov 7, 2024

New activity in MBZUAI/swiftformer-xs over 1 year ago

Adding `safetensors` variant of this model

#1 opened almost 2 years ago by SFconvertbot