Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models • Paper • 2411.07140 • Published 6 days ago • 33
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery • Paper • 2409.05591 • Published Sep 9 • 29
OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs • Paper • 2409.05152 • Published Sep 8 • 30
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction • Paper • 2410.21169 • Published 20 days ago • 29
Can Models Help Us Create Better Models? Evaluating LLMs as Data Scientists • Paper • 2410.23331 • Published 18 days ago • 7
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning • Paper • 2410.22304 • Published 19 days ago • 14
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions • Paper • 2410.20424 • Published 21 days ago • 37
AAAR-1.0: Assessing AI's Potential to Assist Research • Paper • 2410.22394 • Published 19 days ago • 13
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents • Paper • 2410.24024 • Published 17 days ago • 48
Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines • Paper • 2410.21220 • Published 20 days ago • 8
Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance • Paper • 2410.18889 • Published 24 days ago • 15
Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning • Paper • 2410.19290 • Published 24 days ago • 10
ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting • Paper • 2410.17856 • Published 25 days ago • 49
Scaling Diffusion Language Models via Adaptation from Autoregressive Models • Paper • 2410.17891 • Published 25 days ago • 15
Unbounded: A Generative Infinite Game of Character Life Simulation • Paper • 2410.18975 • Published 24 days ago • 34
Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System • Paper • 2410.08115 • Published Oct 10 • 7