Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis Paper • 2505.13227 • Published May 19 • 45
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models Paper • 2505.10554 • Published May 15 • 119
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14 • 94
BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation Paper • 2502.03860 • Published Feb 6 • 24
Reward-Guided Speculative Decoding for Efficient LLM Reasoning Paper • 2501.19324 • Published Jan 31 • 40
Demystifying Domain-adaptive Post-training for Financial LLMs Paper • 2501.04961 • Published Jan 9 • 11
MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs Paper • 2410.04698 • Published Oct 7, 2024 • 13
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems Paper • 2407.01370 • Published Jul 1, 2024 • 90
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning Paper • 2402.15506 • Published Feb 23, 2024 • 17
Unlocking Anticipatory Text Generation: A Constrained Approach for Faithful Decoding with Large Language Models Paper • 2312.06149 • Published Dec 11, 2023 • 3
Diffusion Model Alignment Using Direct Preference Optimization Paper • 2311.12908 • Published Nov 21, 2023 • 50
Lemur: Harmonizing Natural Language and Code for Language Agents Paper • 2310.06830 • Published Oct 10, 2023 • 34
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents Paper • 2308.05960 • Published Aug 11, 2023 • 19
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization Paper • 2308.02151 • Published Aug 4, 2023 • 19