Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 3 items • Updated Jan 2025 • 287 upvotes
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario • Paper • arXiv 2501.10132 • Published Jan 2025 • 17 upvotes
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos • Paper • arXiv 2501.12375 • Published Jan 2025 • 22 upvotes
UI-TARS: Pioneering Automated GUI Interaction with Native Agents • Paper • arXiv 2501.12326 • Published Jan 2025 • 47 upvotes
Introducing smolagents: simple agents that write actions in code • Article • Dec 31, 2024 • 536 upvotes
Train 400x faster Static Embedding Models with Sentence Transformers • Article • Jan 2025 • 129 upvotes
Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging • Article • By akjindal53244 • Aug 19, 2024 • 76 upvotes
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking • Paper • arXiv 2501.09751 • Published Jan 2025 • 47 upvotes
FAST: Efficient Action Tokenization for Vision-Language-Action Models • Paper • arXiv 2501.09747 • Published Jan 2025 • 23 upvotes
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them • Paper • arXiv 2501.08292 • Published Jan 2025 • 17 upvotes
Diffusion Adversarial Post-Training for One-Step Video Generation • Paper • arXiv 2501.08316 • Published Jan 2025 • 32 upvotes
MiniMax-01: Scaling Foundation Models with Lightning Attention • Paper • arXiv 2501.08313 • Published Jan 2025 • 271 upvotes
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot • Paper • arXiv 2501.09012 • Published Jan 2025 • 10 upvotes
Towards Best Practices for Open Datasets for LLM Training • Paper • arXiv 2501.08365 • Published Jan 2025 • 51 upvotes
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents • Paper • arXiv 2501.08828 • Published Jan 2025 • 30 upvotes
Graph Mamba: Towards Learning on Graphs with State Space Models • Paper • arXiv 2402.08678 • Published Feb 13, 2024 • 15 upvotes
Granite Time Series Models • Collection • Time series models trained by IBM, licensed under Apache 2.0 • 5 items • Updated Dec 18, 2024 • 26 upvotes