AI agents are transforming how we interact with technology, but how sustainable are they? 🌍
Design choices — like model size and structure — can massively impact energy use and cost. ⚡💰 The key takeaway: smaller, task-specific models can be far more efficient than large, general-purpose ones.
🔑 Open-source models offer greater transparency, allowing us to track energy consumption and make more informed decisions on deployment. 🌱 Open-source = more efficient, eco-friendly, and accountable AI.
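For a concrete taste of that transparency, here's a minimal sketch of measuring an inference run's footprint with the open-source codecarbon package (the model here is just a small stand-in):

```python
# Minimal sketch: estimating the CO2 footprint of a local inference run
# with the open-source codecarbon library. Model choice is illustrative.
from codecarbon import EmissionsTracker
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")

tracker = EmissionsTracker(project_name="inference-demo")
tracker.start()
generator("How sustainable are AI agents?", max_new_tokens=50)
emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent

print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
```

Swapping in a smaller, task-specific checkpoint and re-running the tracker is a quick way to see the efficiency gap for yourself.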
Hey folks! Just launched DeepGit Lite — a lighter version of DeepGit with fewer components under the hood. It won’t perform quite like the full powerhouse, but it’s great for a quick peek and first-hand feel! ⚙️👀
Give it a spin and tell us what you think! 👉 Try it here: zamal/DeepGit-lite #opensource #DeepGit #gradio #githubresearch
DeepGit: Your GitHub Gold Digger! 💰🚀 Hey Hugging Face gang! Meet DeepGit, my open-source sidekick that rips through GitHub to snag repos that fit you. Done with dead-end searches? Me too. Built it with LangGraph and some dope tricks:
- Embeddings grab the good stuff (HF magic, baby!)
- Re-ranking nails the best picks
- Snoops docs, code, and buzz in one slick flow
- Drops a clean list of hidden gems 💎
Unearth that sneaky ML lib or Python gem: run `python app.py` or `langgraph dev` and boom! Peek it at https://github.com/zamalali/DeepGit. Fork it, tweak it, love it. Docker's in, HF vibes are strong. Drop a 🌟 or a crazy idea, I'm pumped to jam with you all! 🪂
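Curious how the embed-then-re-rank combo works? Here's a hypothetical minimal sketch (not DeepGit's actual code) using sentence-transformers: a bi-encoder for cheap recall, then a cross-encoder to re-rank the survivors:

```python
# Hypothetical sketch of embedding retrieval + cross-encoder re-ranking.
# Illustrative only; DeepGit's real pipeline lives in the repo.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

repo_descriptions = [
    "Lightweight PyTorch library for graph neural networks",
    "CLI tool for syncing dotfiles across machines",
    "Fast approximate nearest-neighbor search in Rust",
]
query = "library for training graph neural networks"

# Stage 1: bi-encoder recall -- cheap embeddings + cosine similarity
bi_encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = bi_encoder.encode(repo_descriptions, convert_to_tensor=True)
query_emb = bi_encoder.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_emb, doc_emb, top_k=3)[0]

# Stage 2: cross-encoder re-ranking -- slower, far more precise
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
pairs = [(query, repo_descriptions[h["corpus_id"]]) for h in hits]
scores = cross_encoder.predict(pairs)

for score, (_, doc) in sorted(zip(scores, pairs), key=lambda x: x[0], reverse=True):
    print(f"{score:.3f}  {doc}")
```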
🚀 ftBoost is LIVE – Stop Struggling with Fine-Tuning Data!
Alright folks, if you’re tired of manually crafting fine-tuning datasets, ftBoost is here to do the heavy lifting. One-click, LangChain-Groq-powered data augmentation that scales your training data in OpenAI, Gemini, Mistral, and LLaMA formats—automatically.
🔥 What’s inside? ✅ Smart Augmentations – Paraphrasing, back translation, synonym swapping & synthetic noise. ✅ No more JSONL headaches – Auto-formats everything for OpenAI, Gemini, Mistral & LLaMA. ✅ Custom tuning – Adjust similarity, diversity, and fluency in real-time. ✅ Upload, generate, download – That’s it.
⚡ If you’re fine-tuning LLMs, this will save you hours.
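Wondering what "auto-formats everything" means in practice? Here's a hypothetical sketch of the kind of JSONL conversion involved; the field names follow OpenAI's chat fine-tuning format, and the helper itself is illustrative, not ftBoost's API:

```python
# Hypothetical sketch: writing augmented (input, output) pairs as
# OpenAI chat-format fine-tuning JSONL. Illustrative only.
import json

augmented_pairs = [
    ("Translate 'hello' to French.", "Bonjour."),
    ("Translate 'hi' into French.", "Salut."),  # paraphrased variant
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for user_msg, assistant_msg in augmented_pairs:
        record = {
            "messages": [
                {"role": "system", "content": "You are a helpful translator."},
                {"role": "user", "content": user_msg},
                {"role": "assistant", "content": assistant_msg},
            ]
        }
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```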
Interact with your PDF documents like never before! 🤯 Extract text & images, then ask context-aware questions grounded in both. Powered by RAG techniques & multimodal LLMs. Perfect for studying, research & more! 📝👀 Try it out now! ✍️
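For a feel of the underlying idea, here's a minimal, hypothetical RAG sketch (not this app's actual code): extract the PDF's text with pypdf, embed the chunks, and retrieve the most relevant ones for a question:

```python
# Hypothetical RAG sketch: extract PDF text, embed chunks, retrieve the
# best matches for a question. File path and models are placeholders.
from pypdf import PdfReader
from sentence_transformers import SentenceTransformer, util

reader = PdfReader("document.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)

# Naive fixed-size chunking; real apps use smarter splitters.
chunks = [text[i : i + 500] for i in range(0, len(text), 500)]

model = SentenceTransformer("all-MiniLM-L6-v2")
chunk_emb = model.encode(chunks, convert_to_tensor=True)

question = "What are the paper's main findings?"
q_emb = model.encode(question, convert_to_tensor=True)
for hit in util.semantic_search(q_emb, chunk_emb, top_k=3)[0]:
    print(f"score={hit['score']:.3f}\n{chunks[hit['corpus_id']][:200]}\n")
```

The retrieved chunks (and, in the multimodal case, extracted images) then go into the LLM prompt as context.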
Community fine-tuned models are more carbon-efficient than the models they are derived from! 🥳🌿
@alozowski @clefourrier @SaylorTwift @albertvillanova evaluated the CO₂ emissions associated with model inference for over 3,000 models on the Open LLM Leaderboard. Interesting trends and new insights emerged... 👀
A new experimental model unlocks stronger reasoning capabilities and shows its thoughts. It plans (with its thoughts visible), can solve complex problems at Flash speeds, and more.
I'm biased but I think HF Posts is the #1 social platform for the AI community! 🤗 That being said, most of us are already on X and now also joining Bluesky.
This full-fledged model is perfect for advanced image and text interactions, with zero GPU required. DeepSeek VL-1.3B Chat typically needs around 8 GB of VRAM and almost 4 GB of storage, but now you can experience it hassle-free right on our Space!
Want something lighter? We’ve also uploaded a 4-bit quantized version (just around 1 GB!), available on my profile. Perfect for those with limited hardware. 🌍🔍
Come try it now and see what this model can do! 🚀✨
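Prefer to run 4-bit locally instead of in the Space? Here's a minimal sketch of on-the-fly 4-bit loading with transformers + bitsandbytes (the model ID is a placeholder; swap in whichever checkpoint you want):

```python
# Minimal sketch: loading a causal LM in 4-bit with bitsandbytes.
# The model ID is a placeholder, not the exact checkpoint from this post.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # 4-bit NormalFloat storage
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
)

model_id = "your-org/your-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```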
zamal/Molmo-4bit Thrilled to announce that the Molmo 7B 4-bit Space is now live! 🚀 The model is roughly six times smaller with almost no performance loss, and the results will leave you amazed!
It runs on ZeroGPU, making it incredibly accessible for everyone!
zamal/Molmo-7B-GPTQ-4bit model is now available for all! This model has been heavily quantized, shrinking it to nearly one-sixth of its original size. It now occupies significantly less disk space and VRAM, making it perfect for deployment on resource-constrained devices without compromising performance.
Now we get:
✅ Efficient performance: maintains high accuracy despite heavy quantization.
✅ Reduced size: nearly six times smaller, optimizing storage and memory usage.
✅ Versatile application: ideal for integrating a powerful vision-language model into various projects, particularly multimodal RAG chains.
Check it out!
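For local use, here's a minimal loading sketch, assuming the optimum/auto-gptq backend is installed (Molmo ships custom modeling code, so the exact calls may differ slightly):

```python
# Minimal sketch: loading a pre-quantized GPTQ checkpoint with transformers.
# Requires the optimum + auto-gptq backend; Molmo uses custom model code,
# hence trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "zamal/Molmo-7B-GPTQ-4bit"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # GPTQ weights are already 4-bit on disk
    trust_remote_code=True,
)
```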
The consistency model (CM) has recently made significant progress in accelerating the generation of diffusion models. However, its application to high-resolution, text-conditioned image generation in the latent space (a.k.a. LCM) remains unsatisfactory. In this paper, we identify three key flaws in the current design of LCM. We investigate the reasons behind these limitations and propose the Phased Consistency Model (PCM), which generalizes the design space and addresses all identified limitations. Our evaluations demonstrate that PCM significantly outperforms LCM across 1–16 step generation settings. While PCM is specifically designed for multi-step refinement, its 1-step generation results are superior or comparable to those of previous state-of-the-art methods specifically designed for 1-step generation. Furthermore, we show that PCM's methodology is versatile and applicable to video generation, enabling us to train a state-of-the-art few-step text-to-video generator.
We present Chameleon, a family of early-fusion, token-based mixed-modal models capable of understanding and generating images and text in any arbitrary sequence. We outline a stable training approach from inception, an alignment recipe, and an architectural parameterization tailored for the early-fusion, token-based, mixed-modal setting. The models are evaluated on a comprehensive range of tasks, including visual question answering, image captioning, text generation, image generation, and long-form mixed-modal generation. Chameleon demonstrates broad and general capabilities, including state-of-the-art performance in image captioning tasks; it outperforms Llama-2 in text-only tasks while being competitive with models such as Mixtral 8x7B and Gemini-Pro, and it performs non-trivial image generation, all in a single model. It also matches or exceeds the performance of much larger models, including Gemini Pro and GPT-4V, according to human judgments on a new long-form mixed-modal generation evaluation, where either the prompt or the outputs contain mixed sequences of both images and text. Chameleon marks a significant step forward in unified modeling of full multimodal documents.