Daily Papers

by AK and the research community

Jun 3

Submitted by

shenzhi-wang

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

·
18 authors

3

Submitted by

andito

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

·
14 authors

Submitted by

zafstojano

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

·
7 authors

4

Submitted by

ZedongWangAI

Taming LLMs by Scaling Learning Rates with Gradient Grouping

·
7 authors

4

Submitted by

kinam0252

Temporal In-Context Fine-Tuning for Versatile Control of Video Diffusion Models

·
3 authors

3

Submitted by

che111

SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning

·
13 authors

2

Submitted by

yejunliang23

ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding

·
5 authors

2

Submitted by

karrykkk

LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks

·
5 authors

2

Submitted by

rhyang2021

ARIA: Training Language Agents with Intention-Driven Reward Aggregation

·
8 authors

2

Submitted by

wangzifu

Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles

·
7 authors

2

Submitted by

lemonaddie

Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control

·
8 authors

Submitted by

sy1998

EarthMind: Towards Multi-Granular and Multi-Sensor Earth Observation with Large Multimodal Models

·
8 authors

2

Submitted by

xssstory

AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

·
13 authors

Submitted by

alexandraww

Unified Scaling Laws for Compressed Representations

·
6 authors

2

Submitted by

Ray2333

MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning

·
8 authors

2

Submitted by

yolay

Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

·
9 authors

2

Submitted by

arnodjiang

IVY-FAKE: A Unified Explainable Framework and Benchmark for Image and Video AIGC Detection

·
6 authors

3

Submitted by

yeonseokjeong

From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval

·
3 authors

2

Submitted by

MasterZhou

Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs

·
10 authors

2

Submitted by

Amirhossein-Alimohammadi

Cora: Correspondence-aware image editing using few step diffusion

·
6 authors

2

Submitted by

AtsuMiyai

WebChoreArena: Evaluating Web Browsing Agents on Realistic Tedious Web Tasks

·
12 authors

3

Submitted by

zhangchenxu

VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL

·
8 authors

2

Submitted by

pyf98

OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning

·
7 authors

2

Submitted by

zd11024

Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors

·
4 authors

2

Submitted by

alemiaschi

Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors

·
7 authors

2

Submitted by

dihuang

CodeV-R1: Reasoning-Enhanced Verilog Generation

·
19 authors

2

Submitted by

yizecheng

DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors

·
4 authors

2

Submitted by

ChenDY

Normalized Attention Guidance: Universal Negative Guidance for Diffusion Model

·
4 authors

3

Submitted by

s-sahoo

Esoteric Language Models

·
10 authors

Submitted by

Saibo-creator

zip2zip: Inference-Time Adaptive Vocabularies for Language Models via Token Compression

·
7 authors

2

Submitted by

Shengran

Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents

·
5 authors

2

Submitted by

FreaxRuby

WHEN TO ACT, WHEN TO WAIT: Modeling Structural Trajectories for Intent Triggerability in Task-Oriented Dialogue

·
8 authors

2

Submitted by

iliashum

Cascading Adversarial Bias from Injection to Distillation in Language Models

·
6 authors

Submitted by

vinthony

VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning

·
4 authors

Submitted by

xwjzds

SATA-BENCH: Select All That Apply Benchmark for Multiple Choice Questions

·
6 authors

2

Submitted by

CNcreator0331

Pro3D-Editor : A Progressive-Views Perspective for Consistent and Precise 3D Editing

·
4 authors

2

Submitted by

Taoer

Stepsize anything: A unified learning rate schedule for budgeted-iteration training

·
5 authors

2

Submitted by

Omartificial-Intelligence-Space

From Guidelines to Practice: A New Paradigm for Arabic Language Model Evaluation

·
6 authors

3

Submitted by

shuzyuan

LLM in the Loop: Creating the PARADEHATE Dataset for Hate Speech Detoxification

·
7 authors

3

Submitted by

Rabinovich

RARE: Retrieval-Aware Robustness Evaluation for Retrieval-Augmented Generation Systems

·
8 authors

2

Submitted by

matthieufp

ComposeAnything: Composite Object Priors for Text-to-Image Generation

·
3 authors

Submitted by

bing-li-ai

OmniResponse: Online Multimodal Conversational Response Generation in Dyadic Interactions

·
5 authors

2

Submitted by

kargaranamir

How Programming Concepts and Neurons Are Shared in Code Language Models

·
4 authors

2

Submitted by

tuvu

SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models

·
6 authors

2

Submitted by

shash42

Pitfalls in Evaluating Language Model Forecasters

·
4 authors

Submitted by

domiso

SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation

·
7 authors

2

Submitted by

vickywu

MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability

·
9 authors

Submitted by

itaynakash

Think Again! The Effect of Test-Time Compute on Preferences, Opinions, and Beliefs of Large Language Models

·
4 authors

2

Submitted by

YongqiLi

Aligning VLM Assistants with Personalized Situated Cognition

·
12 authors

2

Submitted by

Shiweiliuiiiiiii

LIFT the Veil for the Truth: Principal Weights Emerge after Rank Reduction for Reasoning-Focused Supervised Fine-Tuning

·
8 authors

Submitted by

JJ-TMT

CityLens: Benchmarking Large Language-Vision Models for Urban Socioeconomic Sensing

·
7 authors

2

Submitted by

jisx

Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data

·
6 authors

2

Submitted by

xiaobinzhuang

MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation

·
12 authors

2

Submitted by

attentionisallyouneed369

Neuro2Semantic: A Transfer Learning Framework for Semantic Reconstruction of Continuous Language from Human Intracranial EEG

·
6 authors

2

Submitted by

susanliang

BinauralFlow: A Causal and Streamable Approach for High-Quality Binaural Speech Synthesis with Flow Matching Models

·
10 authors

Submitted by

yongchao98

R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning

·
7 authors

2

Submitted by

chtmp223

Frankentext: Stitching random text fragments into long-form narratives

·
4 authors

Submitted by

junhongmit

Plan and Budget: Effective and Efficient Test-Time Scaling on Large Language Model Reasoning

·
7 authors

2

Submitted by

mgolov

Pixels Versus Priors: Controlling Knowledge Priors in Vision-Language Models through Visual Counterfacts

·
6 authors

Submitted by

PoTaTo721

MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling

·
3 authors

Submitted by

prasannareddyp

Shuffle PatchMix Augmentation with Confidence-Margin Weighted Pseudo-Labels for Enhanced Source-Free Domain Adaptation

·
6 authors

2

Submitted by

Floki00

Synthesis of discrete-continuous quantum circuits with multimodal diffusion models

·
5 authors

2