QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 5 days ago • 154
Artificial Hippocampus Networks for Efficient Long-Context Modeling Paper • 2510.07318 • Published 10 days ago • 26
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention Paper • 2510.04212 • Published 13 days ago • 22
VLA-R1: Enhancing Reasoning in Vision-Language-Action Models Paper • 2510.01623 • Published 16 days ago • 7
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain Paper • 2509.26507 • Published 18 days ago • 483
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing Paper • 2509.22186 • Published 22 days ago • 118
SWE-QA: Can Language Models Answer Repository-level Code Questions? Paper • 2509.14635 • Published about 1 month ago • 36
TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them Paper • 2509.21117 • Published 23 days ago • 29
The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs Paper • 2509.09677 • Published Sep 11 • 33
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning Paper • 2509.07980 • Published Sep 9 • 98
Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models Paper • 2509.06949 • Published Sep 8 • 56
view article Article Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face By dvgodoy • Feb 11 • 76