Hey Hugging Face, love your open-source attitude and particularly transformers.js for embedding models! Your current "use this model" integration gives you the transformers.js code, but there is no quick way to actually test a model in one click. SemanticFinder (do-me/SemanticFinder) offers exactly that for all compatible feature-extraction models! All you need to do is append the model ID as a URL parameter, like so: https://do-me.github.io/SemanticFinder/?model=Xenova/bge-small-en-v1.5. You can also choose between quantized and full-precision mode with https://do-me.github.io/SemanticFinder/?model=Xenova/bge-small-en-v1.5&quantized=false. Maybe that would do for a HF integration? I know it's a small open-source project, but I really believe it provides value for devs before they decide on one model or another. It's also much easier than spinning up a notebook, installing dependencies, etc. And since everything runs locally in the browser, you could even do real-world evaluation on personal data without worrying about third-party services' data policies. Happy to hear the community's thoughts!
Meet MergeUI - an All-in-one UI for Exploring Merged LLMs on Hugging Face!
Model merging is a cool new technique for creating powerful language models for cheap (no GPU required). But it raises questions like:
- Which models should we merge?
- What merge strategies work best?
- How do different base models affect performance?
With MergeUI, you can easily:
- Visualise the family tree and lineage of any merged model.
- Explore the benchmark performance of family trees from the Open LLM Leaderboard.
- Analyse the different merge strategies used.
- Check license information for merged models and their ancestors.
All this helps you explore and understand merged models, uncover valuable insights, and make better decisions for your projects.
- Size Consistency: While Kraken's size increases with more experts, Kraken-LoRA remains as compact as the base model (e.g., 8B if you use Meta-Llama3-8b-Instruct).
- VRAM Efficiency: Kraken-LoRA is highly VRAM-efficient, maintaining the power of all experts without the bloat.
- Dynamic Adaptation: LoRA adapters are applied dynamically at runtime, following the routing process.
- High Efficiency: Enjoy increased efficiency without compromising performance, as long as the LoRA adapters match the base model.
Conclusion: Kraken-LoRA gives businesses enhanced flexibility and efficiency from our architecture, enabling further scalability without sacrificing performance.
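The "dynamic adaptation" point above can be sketched in plain Python. This is an illustrative toy, not the actual Kraken-LoRA implementation: the adapter repo names and the keyword router are invented, and a real system would use a learned router and activate the chosen adapter on the shared base model (e.g. via a PEFT-style adapter switch) before generating:

```python
# Toy sketch of per-expert LoRA routing over one shared base model.
# All names below are hypothetical, for illustration only.

EXPERT_ADAPTERS = {
    "code": "my-org/llama3-8b-code-lora",
    "math": "my-org/llama3-8b-math-lora",
    "chat": "my-org/llama3-8b-chat-lora",
}

def route(prompt: str) -> str:
    """Toy keyword router; a real router would be a learned classifier."""
    lowered = prompt.lower()
    if any(k in lowered for k in ("def ", "class ", "bug", "python")):
        return "code"
    if any(k in lowered for k in ("integral", "solve", "equation")):
        return "math"
    return "chat"

def generate(prompt: str) -> str:
    expert = route(prompt)
    adapter = EXPERT_ADAPTERS[expert]
    # In a real stack you would now activate `adapter` on the base model
    # and run generation; here we just report which expert was selected.
    return f"[{expert} expert via {adapter}] {prompt}"

print(generate("Fix this Python bug in def parse()"))
```

Because only small adapter weights are swapped, VRAM stays at the base-model footprint regardless of how many experts are registered.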
New Research Alert - Gaussian Head & Shoulders (Avatars Collection)!
Title: Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping
Description: Gaussian Head & Shoulders is a method for creating high-fidelity upper body avatars by integrating 3D morphable head models with a neural texture warping approach to overcome the limitations of Gaussian splatting.