Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning • Paper • 2507.16784 • Published 7 days ago • 111
Stabilizing Knowledge, Promoting Reasoning: Dual-Token Constraints for RLVR • Paper • 2507.15778 • Published 9 days ago • 19
Seq vs Seq: An Open Suite of Paired Encoders and Decoders • Paper • 2507.11412 • Published 15 days ago • 25
Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data • Paper • 2507.07095 • Published 20 days ago • 53
MIRIX: Multi-Agent Memory System for LLM-Based Agents • Paper • 2507.07957 • Published 19 days ago • 59
Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models • Paper • 2507.07484 • Published 20 days ago • 16
ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention • Paper • 2507.01004 • Published 28 days ago • 10
Skywork-Reward-V2 • Collection • Scaling preference data curation to the extreme • 9 items • Updated 26 days ago • 20
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality • Paper • 2506.19807 • Published Jun 24 • 7
Can Large Language Models Capture Human Annotator Disagreements? • Paper • 2506.19467 • Published Jun 24 • 18
EMLoC: Emulator-based Memory-efficient Fine-tuning with LoRA Correction • Paper • 2506.12015 • Published Jun 13 • 4
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models • Paper • 2505.24864 • Published May 30 • 133