Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 4 days ago • 40
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them Paper • 2501.08292 • Published 4 days ago • 16
OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training Paper • 2501.08197 • Published 4 days ago • 7
Eliciting In-context Retrieval and Reasoning for Long-context Large Language Models Paper • 2501.08248 • Published 4 days ago • 1
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published 11 days ago • 48
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 14 days ago • 82
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published 11 days ago • 63
Multi-task retriever fine-tuning for domain-specific and efficient RAG Paper • 2501.04652 • Published 10 days ago • 10
EpiCoder: Encompassing Diversity and Complexity in Code Generation Paper • 2501.04694 • Published 10 days ago • 9
LLM4SR: A Survey on Large Language Models for Scientific Research Paper • 2501.04306 • Published 10 days ago • 33
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 9 days ago • 75
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics Paper • 2501.04686 • Published 10 days ago • 48
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 10 days ago • 77
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 10 days ago • 230
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 10 days ago • 83
An Empirical Study of Autoregressive Pre-training from Videos Paper • 2501.05453 • Published 9 days ago • 36
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM Paper • 2501.01904 • Published 15 days ago • 31
Test-time Computing: from System-1 Thinking to System-2 Thinking Paper • 2501.02497 • Published 13 days ago • 40