Daily Papers

by AK and the research community

Jun 30

Submitted by

cccjc

BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing

5 authors

Submitted by

StarJiaxing

LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs

4 authors

Submitted by

Mengyi

XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

7 authors

Submitted by

THUdyh

ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models

14 authors

Submitted by

ChengyouJia

From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios

4 authors

Submitted by

hba123

Ark: An Open-source Python-based Framework for Robot Learning

13 authors

Submitted by

AdinaY

Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity

22 authors

Submitted by

LeoLau

Shape-for-Motion: Precise and Consistent Video Editing with 3D Proxy

5 authors

Submitted by

SivanSX

Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs

9 authors

Submitted by

xichenhku

MiCo: Multi-image Contrast for Reinforcement Visual Reasoning

8 authors

Submitted by

tennant

The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements

23 authors

Submitted by

AhmedMostafa

Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training

3 authors

Submitted by

Luo-Yihong

Noise Consistency Training: A Native Approach for One-Step Generator in Learning Additional Controls

4 authors

Submitted by

nomadlx

Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning

5 authors

Submitted by

DanielWurgaft

In-Context Learning Strategies Emerge Rationally

6 authors

Submitted by

mdmoor

SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning

12 authors

Submitted by

Srizzle

Performance Prediction for Large Systems via Text-to-Text Regression

10 authors

Submitted by

Inevitablevalor

Spatial Mental Modeling from Limited Views

14 authors

Submitted by

j-morano

RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models

4 authors

Submitted by

Srikumar26

Global and Local Entailment Learning for Natural World Imagery

5 authors

Submitted by

pengxiang

GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling

15 authors