view article Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL By toslali-ibm and 5 others • Jun 3 • 79
view article Article Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 By tomaarsen and 1 other • Jul 1 • 106
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 181
view article Article From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning By NormalUhr • Feb 4 • 16
view article Article Efficient LLM Pretraining: Packed Sequences and Masked Attention By sirluk • Oct 7, 2024 • 45
JudgeBench: A Benchmark for Evaluating LLM-based Judges Paper • 2410.12784 • Published Oct 16, 2024 • 49