Article State of open video generation models in Diffusers By sayakpaul and 2 others • Jan 27 • 54
Article How Long Prompts Block Other Requests - Optimizing LLM Performance By tngtech • 15 days ago • 2
Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance By tngtech • Apr 16 • 18
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published 24 days ago • 162
Article Enabling Long Context Training with Sequence Parallelism in Axolotl By axolotl-ai-co and 1 other • Apr 4 • 9
Article SigLIP 2: A better multilingual vision language encoder By ariG23498 and 2 others • Feb 21 • 169
Article The case for specialized pre-training: ultra-fast foundation models for dedicated tasks By Pclanglais • Aug 4, 2024 • 30
Scotch & SOTA 🥃 Pt. 7: Human Feedback Datasets 🫣 Collection The elusive “human” feedback • 1 item • Updated Sep 13, 2023 • 1
Scotch & SOTA 🥃 Pt. 6: Dialogue Tuning Datasets 💬 Collection Conversations, turn-based dialog, and things that can be turned into that. • 4 items • Updated Sep 13, 2023 • 1
Scotch & SOTA 🥃 Pt. 5: Instruction Tuning Datasets 👩‍🏫 Collection Question & answer, task completion, general SFT and otherwise finetuney data. • 7 items • Updated Sep 13, 2023 • 1
Article How to deploy and fine-tune DeepSeek models on AWS By pagezyhf and 2 others • Jan 30 • 52
Article Can we create pedagogically valuable multi-turn synthetic datasets from Cosmopedia? By davanstrien • May 7, 2024 • 8
Article Train 400x faster Static Embedding Models with Sentence Transformers By tomaarsen • Jan 15 • 195
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated Apr 30 • 86
PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 10 items • Updated Apr 30 • 72
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31, 2024 • 78