Longxu Dou PRO

dreamerdeo

AI & ML interests

Natural Language Processing

Recent Activity

reacted to their post with ➕ about 10 hours ago
updated a Space about 10 hours ago
sailor2/README
new activity about 13 hours ago
sail/Sailor2-8B-Chat:Fix formatting

Organizations

Sea AI Lab, Table Research Lab, Sea Language Team, ZeroGPU Explorers, Sailor2, Sea AI Lab-Sailor, Sailor2 Evaluation

Posts 1

🚀 Excited to share our technical report on the Southeast Asian multilingual model Sailor2 and its latest updates!

Our 49-page report details Sailor2's development journey, including multilingual data cleaning, small-model data mixture simulations, multi-stage continual pre-training, multi-stage post-training, and multicultural, multilingual evaluations. Sailor2 aims to make multilingual model pre-training more efficient for the community.

🧭 We highlight Sailor2's strong performance in low-resource language translation and its advantages in understanding Southeast Asian cultures, both of which support practical applications for regional languages.

Model updates include: 
💡 More precise outputs: Reduced redundancy in model outputs through refined post-training data and optimization techniques. 
🌈 Handling longer texts: Extended the context length to up to 128K for Southeast Asian languages through long-text training.
⚡️ Faster inference: Achieved 2.5x faster inference with speculative decoding.
🌪️ More model sizes: Introduced new sizes of 3B and 14B through model pruning.

🌟 All models are Apache-licensed for commercial use; development tools (code, resources) are open-source.

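📚 Technical report: Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs (2502.12982), https://huggingface.co/papers/2502.12982
🤖️ Models: https://huggingface.co/collections/sail/sailor2-language-models-674d7c9e6b4dbbd9a869906b
💬 Demo: https://huggingface.co/spaces/sail/Sailor2-20B-Chat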
📣 Sailor2 community: https://huggingface.co/sailor2