Daily Papers

by AK and the research community

Jun 10

Submitted by

unilm

Reinforcement Pre-Training

·
7 authors

4

Submitted by

kenchan0226

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

·
19 authors

2

Submitted by

q-rz

Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance

·
5 authors

1

Submitted by

xcjthu

MiniCPM4: Ultra-Efficient LLMs on End Devices

·
75 authors

1

Submitted by

wchengad

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

·
9 authors

1

Submitted by

bertjiazheng

SpatialLM: Training Large Language Models for Structured Indoor Modeling

·
8 authors

Submitted by

Elizaveta

Image Reconstruction as a Tool for Feature Analysis

·
4 authors

Submitted by

sc-bd

Astra: Toward General-Purpose Mobile Robots via Hierarchical Multimodal Learning

·
71 authors

2

Submitted by

GitBag

Pre-trained Large Language Models Learn Hidden Markov Models In-context

·
5 authors

2

Submitted by

cszy98

Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

·
7 authors

Submitted by

hongyuw

BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation

·
4 authors

Submitted by

Hoter

GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition

·
12 authors

1

Submitted by

noystl

Debatable Intelligence: Benchmarking LLM Judges via Debate Speech Evaluation

·
5 authors

2

Submitted by

RogerLos

Through the Valley: Path to Effective Long CoT Training for Small Language Models

·
4 authors

1

Submitted by

parshinsh

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

·
6 authors

1

Submitted by

ducdauge

Bootstrapping World Models from Dynamics Models in Multimodal Foundation Models

·
5 authors

2

Submitted by

nickjiang

Vision Transformers Don't Need Trained Registers

·
4 authors

Submitted by

MaggieHuang

ConfQA: Answer Only If You Are Confident

·
14 authors

1

Submitted by

ZacLiu

CCI4.0: A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models

·
9 authors

1

Submitted by

amberyzheng

Model Immunization from a Condition Number Perspective

·
4 authors

Submitted by

yunfeixie

Play to Generalize: Learning to Reason Through Game Play

·
6 authors

2

Submitted by

craigwu

GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior

·
6 authors

Submitted by

songff

Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding

·
7 authors

1

Submitted by

Sichengmo

Dreamland: Controllable World Creation with Simulator and Generative Models

·
6 authors

Submitted by

MichaelR207

SynthesizeMe! Inducing Persona-Guided Prompts for Personalized Reward Models in LLMs

·
6 authors

2

Submitted by

JieRuan

ExpertLongBench: Benchmarking Language Models on Expert-Level Long-Form Generation Tasks with Structured Checklists

·
17 authors

2

Submitted by

KaiserWhoLearns

What Is Seen Cannot Be Unseen: The Disruptive Effect of Knowledge Conflict on Large Language Models

·
3 authors

Submitted by

sabrieyuboglu

Cartridges: Lightweight and general-purpose long context representations via self-study

·
11 authors

2

Submitted by

xw-eric

Agents of Change: Self-Evolving LLM Agents for Strategic Planning

·
6 authors

Submitted by

Honghua

τ^2-Bench: Evaluating Conversational Agents in a Dual-Control Environment

·
5 authors

Submitted by

shuoxing

SAFEFLOW: A Principled Protocol for Trustworthy and Transactional Autonomous Agent Systems

·
12 authors

1

Submitted by

RoadQAQ

Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions

·
11 authors

1

Submitted by

jesu9

Self-Adapting Improvement Loops for Robotic Learning

·
5 authors

Submitted by

lesleychou

NetPress: Dynamically Generated LLM Benchmarks for Network Applications

·
7 authors

Submitted by

BestWishYsh

PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement

·
7 authors

Submitted by

ItamarZ

Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs

·
3 authors

Submitted by

LibraTree

GeometryZero: Improving Geometry Solving for LLM with Group Contrastive Policy Optimization

·
7 authors

1

Submitted by

ZZXF

MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories

·
6 authors

2

Submitted by

Sunshine279

Robust Preference Optimization via Dynamic Target Margins

·
8 authors

2

Submitted by

Hidir

Dynamic View Synthesis as an Inverse Problem

·
2 authors

Submitted by

marinero4972

CyberV: Cybernetics for Test-time Scaling in Video Understanding

·
7 authors

Submitted by

michaelchenkj

Improving large language models with concept-aware fine-tuning

·
4 authors

1

Submitted by

mchraba

Evaluating LLMs Robustness in Less Resourced Languages with Proxy Models

·
3 authors

1

Submitted by

chargoddard

Training-Free Tokenizer Transplantation via Orthogonal Matching Pursuit

·
2 authors

Submitted by

xw-eric

Hidden in Plain Sight: Probing Implicit Reasoning in Multimodal Language Models

·
7 authors

Submitted by

lizhuang144

EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions

·
9 authors

Submitted by

aksgupta97

Meta-Adaptive Prompt Distillation for Few-Shot Visual Question Answering

·
3 authors

1

Submitted by

594zyc

Proactive Assistant Dialogue Generation from Streaming Egocentric Videos

·
8 authors

2