Hugging Face
Merve Noyan
merve
6109 followers · 226 following
https://github.com/merveenoyan/smol-vision
mervenoyann
merveenoyan
merve.bsky.social
AI & ML interests
VLMs, vision & co
Recent Activity
posted an update about 16 hours ago
Everything that happened this week in open AI, a recap: https://huggingface.co/collections/merve/jan-17-releases-678a673a9de4a4675f215bf5

Multimodal
- MiniCPM-o 2.6 is a new SOTA any-to-any model by OpenBMB (vision, speech and text!)
- VideoChat-Flash-Qwen2.5-2B is a new video multimodal model by OpenGVLab; the models come in sizes 2B & 7B and in resolutions 224 & 448
- ByteDance released a larger Sa2VA with 26B parameters
- Dataset: VRC-Bench is a new diverse benchmark for multimodal LLM reasoning performance

LLMs
- MiniMax-Text-01 is a new huge language model (456B total, 45.9B active params) by MiniMaxAI with a context length of 4M tokens
- Dataset: Sky-T1-data-17k is a diverse dataset used to train Sky-T1-32B
- kyutai released Helium-1-Preview-2B, a new small multilingual LM
- Wayfarer-12B is a new LLM able to write D&D
- ReaderLM-v2 is a new HTML parsing model by Jina AI
- Dria released Dria-Agent-a-3B, a new agentic coding model (Pythonic function calling) based on Qwen2.5 Coder
- Unsloth released Phi-4, faster and memory-efficient Llama 3.3

Vision
- MatchAnything is a new foundation model for matching
- FitDiT is a high-fidelity VTON model based on the DiT architecture

Audio
- OuteTTS-0.3-1B is a new multilingual text-to-speech model with voice cloning and emotion control capabilities

Retrieval
- lightblue released LB-reranker-0.5B-v1.0, a new reranker based on Qwen2.5 that can handle 95+ languages
- cde-small-v2 is a new SOTA small retrieval model by @jxm
updated a collection about 16 hours ago
Jan 17 Releases
Articles
Introducing smolagents: simple agents that write actions in code. • 18 days ago • 505
Welcome PaliGemma 2 – New vision language models by Google • Dec 5, 2024 • 127
SmolVLM - small yet mighty Vision Language Model • Nov 26, 2024 • 156
Llama can now see and run on your device - welcome Llama 3.2 • Sep 25, 2024 • 181
Preference Optimization for Vision Language Models • Jul 10, 2024 • 55
Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models • Jun 24, 2024 • 183
PaliGemma – Google's Cutting-Edge Open Vision Language Model • May 14, 2024 • 233
Vision Language Models Explained • Apr 11, 2024 • 245
Introduction to Quantization cooked in 🤗 with 💗🧑‍🍳 • Aug 25, 2023 • 25
Deploy MusicGen in no time with Inference Endpoints • Aug 4, 2023 • 4
Open-Source Text Generation & LLM Ecosystem at Hugging Face • Jul 17, 2023 • 2
Jupyter X Hugging Face • Mar 23, 2023 • 2
Using Machine Learning to Aid Survivors and Race through Time • Mar 3, 2023 • 6
Introducing Skops • Aug 12, 2022 • 1
Announcing the Hugging Face Fellowship Program • May 17, 2022 • 6
Showcase Your Projects in Spaces using Gradio • Oct 5, 2021 • 6
Hosting your Models and Datasets on Hugging Face Spaces using Streamlit • Oct 5, 2021 • 3
Organizations
merve's activity
liked a Space about 16 hours ago
MatchAnything • Running on Zero • 67
liked 2 datasets about 16 hours ago
omkarthawakar/VRC-Bench • Viewer • Updated 5 days ago • 1k • 927 • 8
microsoft/PEACE • Viewer • Updated 8 days ago • 7.73k • 526 • 11
liked 3 models about 16 hours ago
ByteDance/Sa2VA-26B • Image-Text-to-Text • Updated 4 days ago • 46 • 8
lightblue/lb-reranker-0.5B-v1.0 • Text Generation • Updated 3 days ago • 550 • 48
jxm/cde-small-v2 • Feature Extraction • Updated 1 day ago • 1.03k • 55
liked 2 models about 17 hours ago
jinaai/ReaderLM-v2 • Text Generation • Updated 1 day ago • 1.6k • 218
MiniMaxAI/MiniMax-Text-01 • Text Generation • Updated 1 day ago • 1.53k • 372
liked a model 5 days ago
finegrain/finegrain-box-segmenter • Mask Generation • Updated Sep 11, 2024 • 4.22k • 100
liked a Space 5 days ago
Gaze Demo • Running on Zero • 149
Gaze detection using Moondream
liked a model 5 days ago
stabilityai/stable-point-aware-3d • Image-to-3D • Updated 1 day ago • 3.77k • 173
liked 3 Spaces 5 days ago
YOLO11 • Running • 11
Ultralytics YOLO11 Gradio Application for Testing
timm Attention Visualization • Running • 13
The timm Leaderboard • Running • 56
liked a model 5 days ago
black-forest-labs/FLUX.1-Depth-dev-lora • Updated Dec 11, 2024 • 10.9k • 130
liked a dataset 5 days ago
HuggingFaceTB/smoltalk • Viewer • Updated Nov 26, 2024 • 2.2M • 6.41k • 283
liked 4 models 5 days ago
OuteAI/OuteTTS-0.1-350M • Text-to-Speech • Updated Nov 27, 2024 • 4.93k • 298
PowerInfer/SmallThinker-3B-Preview • Text Generation • Updated 2 days ago • 51.7k • 353
Yuanshi/OminiControl • Image-to-Image • Updated Dec 10, 2024 • 5.48k • 111
zer0int/CLIP-GmP-ViT-L-14 • Zero-Shot Image Classification • Updated Sep 23, 2024 • 19.2k • 375