Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games Paper • 2506.05309 • Published 9 days ago • 12
Auto-Regressive vs Flow-Matching: a Comparative Study of Modeling Paradigms for Text-to-Music Generation Paper • 2506.08570 • Published 5 days ago • 27
WHISTRESS: Enriching Transcriptions with Sentence Stress Detection Paper • 2505.19103 • Published 21 days ago • 13
Fast Text-to-Audio Generation with Adversarial Post-Training Paper • 2505.08175 • Published May 13 • 22
Follow the Flow: On Information Flow Across Textual Tokens in Text-to-Image Models Paper • 2504.01137 • Published Apr 1 • 21
Scaling Analysis of Interleaved Speech-Text Language Models Paper • 2504.02398 • Published Apr 3 • 29
Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources Paper • 2504.00595 • Published Apr 1 • 36 • 7
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published Mar 31 • 62
Single Image Iterative Subject-driven Generation and Editing Paper • 2503.16025 • Published Mar 20 • 14