Changelog Introducing HF Jobs: Run scalable compute jobs on Hugging Face • 13 days ago • 90
Article A failed experiment: Infini-Attention, and why we should keep trying? By neuralink and 2 others • Aug 14, 2024 • 69
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 165
Article OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models By nvidia and 3 others • 25 days ago • 47
Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation Paper • 2506.19852 • Published Jun 24 • 41
kz919/DeepSeek-R1-Distill-Qwen-1.5B-GRPO-Cautious-TRL-0.18.0.dev Text Generation • 2B • Updated Jun 9 • 3 • 1
Post 2793 Anyone using AI and ML to help neurodivergent people? I'd love to hear what you're doing. • 4 replies