Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs • Paper • 2502.12982 • Published Feb 18 • 15
Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models • Paper • 2412.05939 • Published Dec 8, 2024 • 16
Sailor2 Language Models • Collection • Sailing in South-East Asia with Inclusive Multilingual LLMs • 34 items • Updated 29 days ago • 26
Scaling Laws with Vocabulary • Collection • Increase your vocabulary size when you scale up your language model • 5 items • Updated Aug 11, 2024 • 6
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies • Paper • 2407.13623 • Published Jul 18, 2024 • 56
Bootstrapping Language Models with DPO Implicit Rewards • Paper • 2406.09760 • Published Jun 14, 2024 • 39
DICE • Collection • Self-alignment with DPO Implicit Rewards • 5 items • Updated Jul 28, 2024 • 9
RegMix: Data Mixture as Regression • Collection • Automatic data mixture method for large language model pre-training • 10 items • Updated Jul 26, 2024 • 8
RegMix: Data Mixture as Regression for Language Model Pre-training • Paper • 2407.01492 • Published Jul 1, 2024 • 37
Datasets for Pretrained Thai LLM • Collection • List Datasets for pretrained Thai LLM by PyThaiNLP • 23 items • Updated Sep 12, 2024 • 10