Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers? • By Kseniase and 1 other • 11 days ago • 14
Training a GPT-2 Language Model from Scratch for Moroccan Darija: An Educational Experiment in Low-Resource NLP • By hassoudi • 11 days ago • 1
AI Agents + AI Automation: The Business Formula That's Changing Everything • By Omartificial-Intelligence-Space • 11 days ago • 1
Enabling Long Context Training with Sequence Parallelism in Axolotl • By axolotl-ai-co and 1 other • 12 days ago • 5
Optimise AI Models and Make Them Faster, Smaller, Cheaper, Greener • By PrunaAI and 2 others • 12 days ago • 13
Training Large Language Models with Interpreter Feedback using WebAssembly • By axolotl-ai-co and 1 other • 12 days ago • 11
Kurtis-E1.1: Supervised Fine-tuning of Qwen2.5-3B-Instruct with Flower.ai & Hugging Face • By mrs83 • 13 days ago
Porting Pi0-FAST to LeRobot from JAX to PyTorch: Challenges, Fixes, and Open Questions • By danaaubakirova and 3 others • 14 days ago • 8