Running 115 FineVision: Open Data is All You Need • A new open-source dataset for training VLMs
Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling Paper • 2509.00605 • Published 11 days ago • 41
Beyond Transcription: Mechanistic Interpretability in ASR Paper • 2508.15882 • Published 20 days ago • 84
Article Advanced Flux Dreambooth LoRA Training with 🧨 diffusers By linoyts and 1 other • Oct 21, 2024 • 42
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 654
Running 3.17k The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
Running 1.06k FineWeb: decanting the web for the finest text data at scale 🍷 Generate high-quality web text data for LLM training
Article cocogold: training Marigold for text-grounded segmentation By pcuenq • Jul 8 • 30
Article Train 400x faster Static Embedding Models with Sentence Transformers By tomaarsen • Jan 15 • 209