Article Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance By tiiuae and 5 others • May 21 • 28
HuggingFace's Transformers: State-of-the-art Natural Language Processing Paper • 1910.03771 • Published Oct 9, 2019 • 19
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Paper • 2404.07839 • Published Apr 11, 2024 • 48
StarCoder 2 and The Stack v2: The Next Generation Paper • 2402.19173 • Published Feb 29, 2024 • 147
Article Welcome to Inference Providers on the Hub 🔥 By julien-c and 6 others • Jan 28 • 483
Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! By andito and 2 others • Jan 23 • 181
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 68
Article Timm ❤️ Transformers: Use any timm model with transformers By ariG23498 and 4 others • Jan 16 • 50
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated Apr 30 • 305
Article Don't repeat yourself - 🤗 Transformers Design Philosophy By patrickvonplaten • Apr 5, 2022 • 35
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 796
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Apr 28 • 209
MambaByte: Token-free Selective State Space Model Paper • 2401.13660 • Published Jan 24, 2024 • 60
Canonical models Collection This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e. repos that were not under an org or user namespace • 68 items • Updated Feb 13, 2024 • 14