WebDancer: Towards Autonomous Information Seeking Agency Paper • 2505.22648 • Published 16 days ago • 19
AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and Challenges Paper • 2505.10468 • Published 29 days ago • 9
TEN Agent with VAD and Turn Detection 🔥 A Conversational Voice AI Agent powered by the TEN Framework Space • Running on L4 • 18
ZeroSearch: Incentivize the Search Capability of LLMs without Searching Paper • 2505.04588 • Published May 7 • 64
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math Paper • 2504.21233 • Published Apr 30 • 46
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs Paper • 2503.01743 • Published Mar 3 • 88
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking Paper • 2502.20730 • Published Feb 28 • 38
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published Feb 10 • 153
Chat with DeepSeek-VL2-small 🌍 Generate responses using images and text input Space • Running on Zero • 489
DeepRAG: Thinking to Retrieval Step by Step for Large Language Models Paper • 2502.01142 • Published Feb 3 • 24
Post 1462 VideoLLaMA 3 🔥 Multimodal foundation models for image and video understanding by DAMO Alibaba. Model: DAMO-NLP-SG/videollama3-678cdda9281a0e32fe79af15. Paper: VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding (2501.13106). ✨ 2B/7B ✨ Apache2.0
PaSa: An LLM Agent for Comprehensive Academic Paper Search Paper • 2501.10120 • Published Jan 17 • 51