A Comparative Study on Automatic Coding of Medical Letters with Explainability Paper • 2407.13638 • Published Jul 18 • 5
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence Paper • 2407.07061 • Published Jul 9 • 26
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published Jul 3 • 48
Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions Paper • 2407.06723 • Published Jul 9 • 10
Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps Paper • 2407.07071 • Published Jul 9 • 11
HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions Paper • 2407.15187 • Published Jul 21 • 10
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks Paper • 2408.03615 • Published Aug 7 • 30
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine Paper • 2408.02900 • Published Aug 6 • 25
GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI Paper • 2408.03361 • Published Aug 6 • 85
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12 • 115
OpenResearcher: Unleashing AI for Accelerated Scientific Research Paper • 2408.06941 • Published Aug 13 • 30
LLM-3D Print: Large Language Models To Monitor and Control 3D Printing Paper • 2408.14307 • Published Aug 26 • 3
Writing in the Margins: Better Inference Pattern for Long Context Retrieval Paper • 2408.14906 • Published Aug 27 • 138
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA Paper • 2409.02897 • Published Sep 4 • 44
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance Paper • 2409.04593 • Published Sep 6 • 22
Minstrel: Structural Prompt Generation with Multi-Agents Coordination for Non-AI Experts Paper • 2409.13449 • Published Sep 20 • 10
Imagine yourself: Tuning-Free Personalized Image Generation Paper • 2409.13346 • Published Sep 20 • 67
LEOPARD: A Vision Language Model For Text-Rich Multi-Image Tasks Paper • 2410.01744 • Published Oct 2 • 25
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models Paper • 2410.13085 • Published Oct 16 • 20