Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK Nov 21, 2024 • 35
Article 🚀 Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker! By ariG23498 • 1 day ago • 13
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 3 days ago • 277
Article Index and retrieve documents for vector search using Sentence Transformers and DuckDB By davidberenstein1957 • 3 days ago • 1
Article Explore, Curate and Vector Search Any Hugging Face Dataset with Nomic Atlas By MaxNomic • 7 days ago • 29
Follow The Money Collection https://docs.google.com/presentation/d/1heWC_K_vqWmK5W4Un1aK_wY-aywmjmp6di6vPAn3bns/edit?usp=sharing • 4 items • Updated 8 days ago • 1
Article Yay! Organizations can now publish blog Articles By huggingface • 10 days ago • 30
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 16 days ago • 51
Article Train 400x faster Static Embedding Models with Sentence Transformers 15 days ago • 128
Article Beyond Image Preferences - Rich Human Feedback for Text-to-Image Generation By RapidataAI • 21 days ago • 13
Article Crowd-sourced Open Preference Dataset for Text-to-Image Generation By RapidataAI • 23 days ago • 18
Article Fine-tune a SmolLM on domain-specific synthetic data from a LLM By davidberenstein1957 • 27 days ago • 32
Article Fine-tune ModernBERT for text classification using synthetic data By davidberenstein1957 • about 1 month ago • 28
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated Dec 22, 2024 • 208
Synthetic Data Generator Collection A collection of tools and datasets related to no-code Synthetic Data Generation. • 19 items • Updated 10 days ago • 7