Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 25 days ago • 602
Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL By toslali-ibm and 5 others • Jun 3 • 79
ERNIE 4.5 Collection A collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 25 items • Updated 22 days ago • 157
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Paper • 2506.18896 • Published Jun 23 • 28
Expanding RL with Verifiable Rewards Across Diverse Domains Paper • 2503.23829 • Published Mar 31 • 24
Article Qwen3 X ModelScope Toolkit: Faster Training + Comprehensive Evaluation By tastelikefeet • May 8 • 1
LongGenBench: Long-context Generation Benchmark Paper • 2410.04199 • Published Oct 5, 2024 • 22
Phi-4 (All Versions) Collection Microsoft's Phi-4 models including Reasoning + Reasoning Plus & mini. Includes Dynamic 2.0 GGUF, 4-bit & 16-bit versions. Includes Unsloth's bug fixes • 20 items • Updated 3 days ago • 73
Article SmolLM - blazingly fast and remarkably powerful By loubnabnl and 2 others • Jul 16, 2024 • 401