Théo Gigant

gigant

https://giganttheo.github.io/

AI & ML interests

multimodal

Recent Activity

upvoted an article about 1 month ago

nanoVLM: The simplest repository to train your VLM in pure PyTorch

updated a dataset about 2 months ago

gigant/tib-bench

upvoted an article about 2 months ago

Vision Language Models (Better, Faster, Stronger)

upvoted an article about 1 month ago

nanoVLM: The simplest repository to train your VLM in pure PyTorch

Article • and 6 others • May 21 • 181

upvoted an article about 2 months ago

Vision Language Models (Better, Faster, Stronger)

Article • and 4 others • May 12 • 469

upvoted 2 papers 3 months ago

Perception Encoder: The best visual embeddings are not at the output of the network

Paper • 2504.13181 • Published Apr 17 • 34

Summarization of Multimodal Presentations with Vision-Language Models: Study of the Effect of Modalities and Structure

Paper • 2504.10049 • Published Apr 14 • 3

upvoted 2 articles 4 months ago

Open R1: Update #3

Article • and 9 others • Mar 11 • 294

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Article • and 3 others • Mar 12 • 439

upvoted a paper 4 months ago

EuroBERT: Scaling Multilingual Encoders for European Languages

Paper • 2503.05500 • Published Mar 7 • 81

upvoted 2 articles 4 months ago

Introducing EuroBERT: A High-Performance Multilingual Encoder Model

Article • and 3 others • Mar 10 • 144

A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality

Article • and 3 others • Mar 4 • 75

upvoted a paper 4 months ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 193

upvoted an article 4 months ago

SigLIP 2: A better multilingual vision language encoder

Article • and 2 others • Feb 21 • 172

upvoted 2 papers 5 months ago

Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning

Paper • 2502.06533 • Published Feb 10 • 18

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 235

upvoted an article 5 months ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • and 2 others • Jan 28 • 870

upvoted an article 7 months ago

EuroLLM-9B

Article • and 5 others • Dec 2, 2024 • 122

upvoted a paper 9 months ago

EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24, 2024 • 26

upvoted 2 papers 10 months ago

Contextual Position Encoding: Learning to Count What's Important

Paper • 2405.18719 • Published May 29, 2024 • 5

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 132

upvoted 2 papers 11 months ago

MiniCPM-V: A GPT-4V Level MLLM on Your Phone

Paper • 2408.01800 • Published Aug 3, 2024 • 85

Harvesting Textual and Structured Data from the HAL Publication Repository

Paper • 2407.20595 • Published Jul 30, 2024 • 22
