SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published 12 days ago • 76
Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality 22 days ago • 69
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 214
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Paper • 2402.07827 • Published Feb 12, 2024 • 47
Naijaweb datasets 🇳🇬 Collection A recreation of the fineweb collection for Nigerians • 3 items • Updated Oct 24, 2024 • 6
OpenCulture Collection A multilingual dataset of public domain books and newspapers. • 27 items • Updated Nov 6, 2024 • 124
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling Paper • 2311.00430 • Published Nov 1, 2023 • 59