Leo Tronchon PRO

Leyo

AI & ML interests

Multimodal, Self-Supervised Learning

Leyo's activity

liked a dataset 3 months ago

wyu1/Leopard-Instruct

Viewer • Updated Nov 8, 2024 • 1.03M • 343k • 56

authored a paper 6 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 125

liked a dataset 7 months ago

HuggingFaceM4/Docmatix

Viewer • Updated Aug 26, 2024 • 2.55M • 14.1k • 254

upvoted an article 7 months ago

Docmatix - a huge dataset for Document Visual Question Answering

Article • Jul 18, 2024 • 72

liked a dataset 8 months ago

tomg-group-umd/pixelprose

Viewer • Updated Jun 23, 2024 • 15.6M • 511 • 144

upvoted an article 8 months ago

Introducing Idefics2: A Powerful 8B Vision-Language Model for the community

Article • Apr 15, 2024 • 174

New activity in HuggingFaceM4/idefics2-8b 8 months ago

Multi-gpu fine-tuning

#30 opened 10 months ago by matbee

liked a dataset 9 months ago

HuggingFaceFW/fineweb-edu

Viewer • Updated 21 days ago • 3.3B • 546k • 633

New activity in HuggingFaceM4/idefics2-8b 9 months ago

LoRA config used for training

#64 opened 9 months ago by schwarzwalder

liked a model 9 months ago

meta-llama/Meta-Llama-3-8B

Text Generation • Updated Sep 27, 2024 • 535k • 6.04k

New activity in HuggingFaceM4/idefics2-8b 9 months ago

How is the image resolution expanded in a vision encoder?

#57 opened 9 months ago by efei

liked a model 9 months ago

HuggingFaceM4/idefics2-8b-chatty-AWQ

Image-Text-to-Text • Updated May 6, 2024 • 13 • 5

upvoted an article 9 months ago

Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task

Article • May 16, 2024 • 17

liked a Space 10 months ago

Open VLM Leaderboard 🌎

VLMEvalKit Evaluation Results Collection • 627

authored a paper 10 months ago

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 102

liked 2 models 10 months ago

HuggingFaceM4/idefics2-8b-base

Image-Text-to-Text • Updated Jul 30, 2024 • 1.15k • 27

HuggingFaceM4/idefics2-8b-chatty

Image-Text-to-Text • Updated Jul 30, 2024 • 2.31k • 92

upvoted a paper 10 months ago

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 102

liked a Space 10 months ago

IDEFICS2 Playground 🐨 • 168

New activity in HuggingFaceM4/idefics2_playground 10 months ago

Adding "test type" for redteaming notes.

#3 opened 10 months ago by meg