Article From Llasa to Llasagna: Finetuning LLaSA to generate Italian speech and other languages By Steveeeeeeen and 1 other • 10 days ago • 22
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis Paper • 2410.23320 • Published Oct 30, 2024 • 8
Article Transformers.js v3: WebGPU support, new models & tasks, and more… Oct 22, 2024 • 67
Llama3-8B-1.58 Collection A trio of powerful models: fine-tuned from Llama3-8b-Instruct, with BitNet architecture! • 3 items • Updated Sep 14, 2024 • 11
Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy Sep 18, 2024 • 223
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated Jan 17 • 162
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 68 items • Updated 7 days ago • 108
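As a minimal sketch of what "works out of the box" means here, assuming the `sentence-transformers/all-nli` dataset from this collection and the Sentence Transformers v3 training API (the base model chosen below is only an illustrative assumption), one of these datasets can be passed straight to the trainer without any column preprocessing:

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Load a dataset from the collection; the "triplet" config yields
# (anchor, positive, negative) columns, which the loss below expects as-is.
train_dataset = load_dataset("sentence-transformers/all-nli", "triplet", split="train")

# Any embedding model can be the starting point; this checkpoint is an assumption for illustration.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Because the column layout already matches the loss, the dataset plugs in directly.
trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=MultipleNegativesRankingLoss(model),
)
trainer.train()
```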
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • Apr 24, 2024 • 61
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Dec 13, 2024 • 330
Canonical models Collection This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e. repos that were not under an org or user namespace. • 68 items • Updated Feb 13, 2024 • 14
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated 2 days ago • 53
Switch-Transformers release Collection This release included various MoE (Mixture of Experts) models based on the T5 architecture. The base models use from 8 to 256 experts. • 9 items • Updated Dec 13, 2024 • 17