Inference-Time Scaling for Generalist Reward Modeling Paper • 2504.02495 • Published 12 days ago • 52
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning Paper • 2504.00891 • Published 14 days ago • 12
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 28 days ago • 117
RWKV-7 "Goose" with Expressive Dynamic State Evolution Paper • 2503.14456 • Published 28 days ago • 137
Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models Paper • 2503.11073 • Published Mar 14 • 1
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation Paper • 2502.18364 • Published Feb 25 • 36
Enhancing Language Multi-Agent Learning with Multi-Agent Credit Re-Assignment for Interactive Environment Generalization Paper • 2502.14496 • Published Feb 20
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 381
AIGS: Generating Science from AI-Powered Automated Falsification Paper • 2411.11910 • Published Nov 17, 2024
Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages Paper • 2402.12204 • Published Feb 19, 2024 • 1
Enabling Weak LLMs to Judge Response Reliability via Meta Ranking Paper • 2402.12146 • Published Feb 19, 2024
An Extensible Plug-and-Play Method for Multi-Aspect Controllable Text Generation Paper • 2212.09387 • Published Dec 19, 2022
Towards Unified Alignment Between Agents, Humans, and Environment Paper • 2402.07744 • Published Feb 12, 2024 • 3
Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization Paper • 2310.02170 • Published Oct 3, 2023 • 2
SEABO: A Simple Search-Based Method for Offline Imitation Learning Paper • 2402.03807 • Published Feb 6, 2024
Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence Paper • 2404.05892 • Published Apr 8, 2024 • 39