microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated 2 days ago • 231k • 1.04k
Running • 126 • Qwen2.5 VL 72B Instruct 💻 Interact with the Qwen2.5-VL-Chat model using text and files
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 114
QuantFactory/Llama-3.2-Taiwan-Legal-3B-Instruct-GGUF Text Generation • Updated Nov 2, 2024 • 3.28k • 11
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published Jan 9 • 95
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 108
Running on Zero • 113 • Llama3.1 S V0.2 Checkpoint 2024 08 20 😻 Convert text to audio and vice versa
bullerwins/gradientai_Llama-3-8B-Instruct-262k_exl2_8.0bpw Text Generation • Updated Apr 26, 2024 • 15 • 3
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 256
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 180
Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians Paper • 2312.03029 • Published Dec 5, 2023 • 26