Brigitte Tousignant

BrigitteTousi

AI & ML interests

None yet

Recent Activity

reacted to tomaarsen's post with ❤️ about 13 hours ago
An assembly of 18 European companies, labs, and universities have banded together to launch 🇪🇺 EuroBERT! It's a state-of-the-art multilingual encoder for 15 European languages, designed to be finetuned for retrieval, classification, etc.

🇪🇺 15 Languages: English, French, German, Spanish, Chinese, Italian, Russian, Polish, Portuguese, Japanese, Vietnamese, Dutch, Arabic, Turkish, Hindi
3️⃣ 3 model sizes: 210M, 610M, and 2.1B parameters - very very useful sizes in my opinion
➡️ Sequence length of 8192 tokens! Nice to see these higher sequence lengths for encoders becoming more common.
⚙️ Architecture based on Llama, but with bi-directional (non-causal) attention to turn it into an encoder. Flash Attention 2 is supported.
🔥 A new Pareto frontier (stronger *and* smaller) for multilingual encoder models
📊 Evaluated against mDeBERTa, mGTE, XLM-RoBERTa for Retrieval, Classification, and Regression (after finetuning for each task separately): EuroBERT punches way above its weight.
📝 Detailed paper with all details, incl. data: FineWeb for English and CulturaX for multilingual data, The Stack v2 and Proof-Pile-2 for code.

Check out the release blogpost here: https://huggingface.co/blog/EuroBERT/release
* https://huggingface.co/EuroBERT/EuroBERT-210m
* https://huggingface.co/EuroBERT/EuroBERT-610m
* https://huggingface.co/EuroBERT/EuroBERT-2.1B

The next step is for researchers to build upon the 3 EuroBERT base models and publish strong retrieval, zero-shot classification, etc. models for all to use. I'm very much looking forward to it!
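The post's point about bi-directional (non-causal) attention can be illustrated with a small sketch. This is a generic illustration of the two masking schemes, not EuroBERT's actual implementation; the function name `attention_mask` is hypothetical and chosen only for this example.

```python
def attention_mask(seq_len, causal):
    """Build a seq_len x seq_len mask.

    mask[i][j] is True when token i is allowed to attend to token j.
    Hypothetical helper for illustration only.
    """
    return [
        [(j <= i) if causal else True for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Decoder-style (causal): each token sees only itself and earlier tokens.
causal_mask = attention_mask(4, causal=True)

# Encoder-style (bi-directional): each token sees the whole sequence.
bidirectional_mask = attention_mask(4, causal=False)

for name, mask in (("causal", causal_mask), ("bidirectional", bidirectional_mask)):
    print(name)
    for row in mask:
        print(" ".join("1" if allowed else "0" for allowed in row))
```

In the causal case the mask is lower-triangular, which suits left-to-right generation; removing that restriction lets every position use context from both sides, which is what an encoder needs for retrieval or classification.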

Organizations

Hugging Face, Society & Ethics, HuggingFaceM4, Open-Source AI Meetup, BigCode, Hugging Face OSS Metrics, IBM-NASA Prithvi Models Family, Hugging Face TB Research, Wikimedia Movement, LeRobot, Women on Hugging Face, Journalists on Hugging Face, Social Post Explorers, Dev Mode Explorers, Hugging Face Science, Coordination Nationale pour l'IA, open/ acc, Bluesky Community, Sandbox, Open R1

Posts 3

Post
703
Regardless of X being down or not, so glad I can rely on HF Posts for AI news ❤️🤗

Articles 3

Article
293

Open-R1: Update #1

models

None public yet

datasets

None public yet