Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published 7 days ago • 118
Qwen3 Collection Qwen's new Qwen3 models in Unsloth Dynamic 2.0, GGUF, 4-bit, and 16-bit Safetensor formats. Includes 128K Context Length variants. • 65 items • Updated about 19 hours ago • 140
BitNet Collection 🔥 BitNet family of large language models (1-bit LLMs). • 7 items • Updated 12 days ago • 38
Granite Experiments Collection Experimental projects under consideration for the Granite family. • 16 items • Updated 11 days ago • 12
Granite 3.3 Language Models Collection Our latest language models, licensed under the Apache 2.0 license. • 4 items • Updated 11 days ago • 33
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization Paper • 2502.02631 • Published Feb 4 • 3
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 157
BitNet: Scaling 1-bit Transformers for Large Language Models Paper • 2310.11453 • Published Oct 17, 2023 • 102
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published Feb 10 • 152