Cut Your Losses in Large-Vocabulary Language Models • Paper • 2411.09009 • Published 4 days ago • 26 • 4
Direct Preference Optimization Using Sparse Feature-Level Constraints • Paper • 2411.07618 • Published 5 days ago • 13 • 3
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation • Paper • 2411.08380 • Published 4 days ago • 22 • 3
Large Language Models Can Self-Improve in Long-context Reasoning • Paper • 2411.08147 • Published 5 days ago • 47 • 4
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization • Paper • 2411.06208 • Published 8 days ago • 18 • 6
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models • Paper • 2411.07126 • Published 6 days ago • 27 • 5
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models • Paper • 2411.07140 • Published 6 days ago • 33 • 3
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision • Paper • 2411.07199 • Published 6 days ago • 42 • 5
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models • Paper • 2411.07232 • Published 6 days ago • 56 • 4
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images • Paper • 2411.05738 • Published 9 days ago • 13 • 3
Balancing Pipeline Parallelism with Vocabulary Parallelism • Paper • 2411.05288 • Published 10 days ago • 18 • 3
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents • Paper • 2410.23218 • Published 18 days ago • 46 • 3
Language Models can Self-Lengthen to Generate Long Texts • Paper • 2410.23933 • Published 17 days ago • 15 • 3
A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents • Paper • 2410.22476 • Published 19 days ago • 24 • 3
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective • Paper • 2410.23743 • Published 17 days ago • 58 • 4
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders • Paper • 2410.22366 • Published 20 days ago • 73 • 3
HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale • Paper • 2406.19280 • Published Jun 27 • 60 • 9
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions • Paper • 2406.15877 • Published Jun 22 • 45 • 8