DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published 6 days ago • 200
Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention Paper • 2605.22791 • Published 5 days ago • 24
Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning Paper • 2605.14386 • Published 12 days ago • 59
MinT: Managed Infrastructure for Training and Serving Millions of LLMs Paper • 2605.13779 • Published 13 days ago • 217
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation Paper • 2605.13724 • Published 13 days ago • 96
Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers Paper • 2605.06169 • Published 19 days ago • 229
Flow-OPD: On-Policy Distillation for Flow Matching Models Paper • 2605.08063 • Published 18 days ago • 97
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published 29 days ago • 118
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published Apr 6 • 114
Gemma 4 Uncensored Collection • Abliterated Gemma 4 models with refusal behavior removed. Biprojection + EGA for MoE. Cross-validated against 686 prompts from 4 datasets. • 8 items • Updated Apr 5 • 86
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 351
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Paper • 2603.25730 • Published Mar 26 • 53
From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space Paper • 2603.12648 • Published Mar 13 • 14
Qwen3.5 Unredacted MAX Collection • Continual “abliteration” models – experimental. • 8 items • Updated 29 days ago • 4