Article Bringing Fusion Down to Earth: ML for Stellarator Optimization By cgeorgiaw • 4 days ago • 57
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published 10 days ago • 57
Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • 24 days ago • 105
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data By danaaubakirova and 8 others • Jun 3 • 175
Article Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models By loubnabnl and 2 others • Mar 20, 2024 • 96
Article Tiny Agents: a MCP-powered agent in 50 lines of code By julien-c • Apr 25 • 284
Article Atlaset Dataset for Moroccan Darija: From Data Collection, Analysis, to Model Trainings By atlasia and 1 other • Mar 6 • 25
Article MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era By MiniMax-AI • Jan 15 • 47
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published Feb 19 • 37
Article The Open Arabic LLM Leaderboard 2 By alielfilali01 and 7 others • Feb 10 • 33
Article Arabic RAG Leaderboard: A Comprehensive Framework for Evaluating Arabic Language Retrieval Systems By Navid-AI and 1 other • Feb 9 • 12
Article Darija Chatbot Arena: Making LLMs Compete in the Moroccan Dialect By atlasia and 2 others • Feb 10 • 14
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 235
Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 870
Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect Paper • 2409.17912 • Published Sep 26, 2024 • 29
Article Yay! Organizations can now publish blog Articles By huggingface and 3 others • Jan 20 • 46
Article TerjamaBench: A Cultural Benchmark for English-Darija Machine Translation By imomayiz and 4 others • Jan 10 • 33