Daily Papers

by AK and the research community

Jun 12

Submitted by

kuznetsoffandrey

Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

·
5 authors

Submitted by

wujie10

Seedance 1.0: Exploring the Boundaries of Video Generation Models

·
44 authors

Submitted by

Hanyuezhuohua

Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

·
5 authors

Submitted by

imryanxu

ComfyUI-R1: Exploring Reasoning Models for Workflow Generation

·
8 authors

Submitted by

akhaliq

Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation

·
9 authors

Submitted by

xichenhku

PlayerOne: Egocentric World Simulator

·
6 authors

Submitted by

hassid

Auto-Regressive vs Flow-Matching: a Comparative Study of Modeling Paradigms for Text-to-Music Generation

·
3 authors

Submitted by

LongMountain

SeerAttention-R: Sparse Attention Adaptation for Long Reasoning

·
15 authors

Submitted by

ChengpengLi

CoRT: Code-integrated Reasoning within Thinking

·
11 authors

Submitted by

Lemoncoke

SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner

·
9 authors

Submitted by

jy-yuan

Give Me FP32 or Give Me Death? Challenges and Solutions for Reproducible Reasoning

·
10 authors

Submitted by

zhenzhiwang

InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions

·
8 authors

Submitted by

niveck

Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games

·
3 authors

Submitted by

WaltonFuture

Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning

·
7 authors

Submitted by

guqiao

SAFE: Multitask Failure Detection for Vision-Language-Action Models

·
7 authors

Submitted by

ashawkey

Efficient Part-level 3D Object Generation via Dual Volume Packing

·
10 authors

Submitted by

taesiri

Hidden in plain sight: VLMs overlook their visual representations

·
4 authors

Submitted by

NikV09

UFM: A Simple Path towards Unified Dense Correspondence with Flow

·
12 authors

Submitted by

sungwon95

Cross-Frame Representation Alignment for Fine-Tuning Video Diffusion Models

·
5 authors

Submitted by

Zory

Can Vision Language Models Infer Human Gaze Direction? A Controlled Study

·
10 authors

Submitted by

j-morano

MIRAGE: Multimodal foundation model and benchmark for comprehensive retinal OCT image analysis

·
10 authors

Submitted by

wy1iu

Reparameterized LLM Training via Orthogonal Equivalence Transformation

·
6 authors

Submitted by

Lihuchen

Query-Level Uncertainty in Large Language Models

·
2 authors

Submitted by

SushantGautam

Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy

·
3 authors

Submitted by

pranamanam

Branched Schrödinger Bridge Matching

·
4 authors

Submitted by

fangwu97

When to Trust Context: Self-Reflective Debates for Context Reliability

·
8 authors

Submitted by

Prakamya

TTT-Bench: A Benchmark for Evaluating Reasoning Ability with Simple and Novel Tic-Tac-Toe-style Games

·
6 authors

Submitted by

TreeForest

A Call for Collaborative Intelligence: Why Human-Agent Systems Should Precede AI Autonomy

·
13 authors