Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published 14 days ago • 68
AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning Paper • 2507.12841 • Published Jul 17 • 40
Evaluating Vision-Language Models as Evaluators in Path Planning Paper • 2411.18711 • Published Nov 27, 2024
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search Paper • 2503.10582 • Published Mar 13 • 24
Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators Paper • 2503.19877 • Published Mar 25 • 1
VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge Paper • 2504.10342 • Published Apr 14 • 11
Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time Paper • 2504.12329 • Published Apr 12
The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think Paper • 2505.10185 • Published May 15 • 26
VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation Paper • 2506.03930 • Published Jun 4 • 25
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning Paper • 2507.00432 • Published Jul 1 • 73
Assessor360: Multi-sequence Network for Blind Omnidirectional Image Quality Assessment Paper • 2305.10983 • Published May 18, 2023 • 1
Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption Paper • 2503.11221 • Published Mar 14 • 1
MANIQA: Multi-dimension Attention Network for No-Reference Image Quality Assessment Paper • 2204.08958 • Published Apr 19, 2022 • 1