view article Article Open-source DeepResearch – Freeing our search agents By m-ric and 4 others • Feb 4 • 1.26k
view article Article Visualize and understand GPU memory in PyTorch By qgallouedec • Dec 24, 2024 • 227
view article Article ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models By yuchenlin • Jul 27, 2024 • 33
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models By andito and 2 others • Jun 24, 2024 • 197
Still-Moving: Customized Video Generation without Customized Video Data Paper • 2407.08674 • Published Jul 11, 2024 • 13
view article Article Preference Optimization for Vision Language Models By qgallouedec and 3 others • Jul 10, 2024 • 79
view article Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints By sergeipetrov and 3 others • May 1, 2024 • 77
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation Paper • 2403.12015 • Published Mar 18, 2024 • 68
⚓️ Sailor Language Models Collection Sailor: Open Language Models tailored for South-East Asia (SEA) released by Sea AI Lab. • 17 items • Updated Dec 3, 2024 • 17
Text-to-Image Base Models Collection All text-to-image open source base models, with their respective license • 28 items • Updated May 10, 2024 • 25
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data Paper • 2402.08093 • Published Feb 12, 2024 • 62
DITTO: Diffusion Inference-Time T-Optimization for Music Generation Paper • 2401.12179 • Published Jan 22, 2024 • 22
Masked Audio Generation using a Single Non-Autoregressive Transformer Paper • 2401.04577 • Published Jan 9, 2024 • 44
SIGNeRF: Scene Integrated Generation for Neural Radiance Fields Paper • 2401.01647 • Published Jan 3, 2024 • 13
MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer Paper • 2311.12052 • Published Nov 18, 2023 • 32