The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published 16 days ago • 40
🧠 Traditional Chinese Reasoning Datasets Collection A curated collection of datasets designed to evaluate and train reasoning capabilities in Traditional Chinese across various domains. • 3 items • Updated May 14 • 8
view article Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality By saurabhdash and 3 others • Mar 4 • 75
Breeze 2 Family Collection Llama-Breeze2 is a multi-modal language model family specifically intended for Traditional Chinese use. BreezyVoice is a Taiwan Mandarin TTS • 6 items • Updated Feb 26 • 18
Cosmos-Tokenizer Collection A suite of image and video tokenizers • 13 items • Updated 4 days ago • 40
AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 77
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5 • 269
view article Article Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques 👐 📚 By Isayoften • Aug 26, 2024 • 67
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases Paper • 2407.12784 • Published Jul 17, 2024 • 52
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5 • 227
view article Article SmolLM - blazingly fast and remarkably powerful By loubnabnl and 2 others • Jul 16, 2024 • 378
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 4 days ago • 163
Llama3-ChatQA-1.5 Collection Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG). • 6 items • Updated 4 days ago • 44
view article Article CodeGemma - an official Google release for code LLMs By pcuenq and 5 others • Apr 9, 2024 • 101
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information Paper • 2402.13616 • Published Feb 21, 2024 • 48
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated 23 days ago • 335
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model Paper • 2309.16058 • Published Sep 27, 2023 • 56
AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning Paper • 2308.03526 • Published Aug 7, 2023 • 26