Delta Activations: A Representation for Finetuned Large Language Models Paper • 2509.04442 • Published 3 days ago • 3
Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers Paper • 2509.03059 • Published 4 days ago • 15
DeepResearch Arena: The First Exam of LLMs' Research Abilities via Seminar-Grounded Tasks Paper • 2509.01396 • Published 6 days ago • 45
Qwen-Image-Exp-LoRA Collection • Illustration Design, Style Intermix • 7 items • Updated about 13 hours ago • 3
Video-MTR: Reinforced Multi-Turn Reasoning for Long Video Understanding Paper • 2508.20478 • Published 10 days ago • 16
Transition Models: Rethinking the Generative Learning Objective Paper • 2509.04394 • Published 3 days ago • 19
Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions? Paper • 2509.04292 • Published 3 days ago • 45
Towards a Unified View of Large Language Model Post-Training Paper • 2509.04419 • Published 3 days ago • 53
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth Paper • 2509.03867 • Published 3 days ago • 165
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning Paper • 2509.02544 • Published 5 days ago • 102
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published 5 days ago • 76
How Can Input Reformulation Improve Tool Usage Accuracy in a Complex Dynamic Environment? A Study on τ-bench Paper • 2508.20931 • Published 10 days ago • 15
PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning Paper • 2508.21104 • Published 10 days ago • 28
T2R-bench: A Benchmark for Generating Article-Level Reports from Real World Industrial Tables Paper • 2508.19813 • Published 11 days ago • 20
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks Paper • 2508.18672 • Published 12 days ago • 9
ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks Paper • 2508.15804 • Published 24 days ago • 14
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling Paper • 2508.17445 • Published 14 days ago • 78