SeaLLMs - Language Models for Southeast Asian Languages

SeaLLMs - Large Language Models for Southeast Asia

Welcome to the SeaLLMs project - a family of large language models tailored for languages spoken in Southeast Asia, including English, Chinese, Indonesian, Vietnamese, Thai, Tagalog, Malay, Burmese, Khmer, Lao, Tamil, and Javanese.

Whereas most models are designed primarily for high-resource languages like English, our mission is to democratize access to advanced language technologies for regional and potentially under-represented languages, while prioritizing safety and trustworthiness within the regional context.

☄️ What's New (in 2025)?

Since the release of SeaLLMs-v3, we have focused on two directions: broader language coverage and multimodal support. We are happy to share:

  • 🌏 Babel: a multilingual LLM that covers the top 25 languages by number of speakers, which together serve over 90% of the global population
  • 🎧 SeaLLMs-Audio: the multimodal (audio) extension of SeaLLMs and the first large audio-language model designed to support multiple Southeast Asian languages

SeaLLMs Models

  • SeaLLMs-v3: The latest version of the SeaLLMs family, achieving SOTA performance on diverse tasks while being specifically enhanced for trustworthiness. Available in multiple variants: 7B-Chat, 1.5B-Chat, 1.5B-base and 7B-base.
  • SeaLLMs/SeaLLM-7B-v2.5: A newer SeaLLM-7B model with SOTA results among 7B models on many world knowledge and reasoning tasks in SEA languages.
  • SeaLLMs/SeaLLM-7B-v2: The most significant upgrade since SeaLLM-13B. At half the size, it delivers stronger performance across diverse multilingual tasks, including world knowledge, math reasoning, and instruction following.
  • SeaLLMs/SeaLLM-13B-Chat: A chatbot optimized for Vietnamese 🇻🇳, Indonesian 🇮🇩, Thai 🇹🇭, Malay 🇲🇾, Khmer 🇰🇭, Lao 🇱🇦, Tagalog 🇵🇭 and Burmese 🇲🇲.

Multilingual Evaluations for SEA

  • LLM Leaderboard for Southeast Asian Languages: evaluates LLMs on Southeast Asian languages through two comprehensive benchmarks - SeaExam and SeaBench
  • SeaExam assesses world knowledge and reasoning capabilities through exam-style questions, and applies to both base and chat models [data (public), eval code]
  • SeaBench evaluates instruction-following abilities and multi-turn conversational skills, and therefore applies only to chat models [data (public), eval code]
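
To make the exam-style evaluation more concrete, here is a minimal sketch of how accuracy on multiple-choice questions might be scored per language. It is not the project's eval code: the record fields (`lang`, `question`, `choices`, `answer`), the function `score_multiple_choice`, and the stand-in predictor `always_a` are all assumptions made for illustration, and the toy questions are invented rather than taken from SeaExam.

```python
from collections import defaultdict


def score_multiple_choice(examples, predict):
    """Return (overall accuracy, per-language accuracy) for exam-style items.

    examples: iterable of dicts with hypothetical keys
        "lang", "question", "choices", "answer" (a letter such as "A").
    predict: callable that takes one example and returns an answer letter.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        total[ex["lang"]] += 1
        # Compare letters case-insensitively, ignoring stray whitespace.
        if predict(ex).strip().upper() == ex["answer"].strip().upper():
            correct[ex["lang"]] += 1
    per_lang = {lang: correct[lang] / total[lang] for lang in total}
    n = sum(total.values())
    overall = sum(correct.values()) / n if n else 0.0
    return overall, per_lang


# Toy data, invented for illustration only (not from SeaExam).
examples = [
    {"lang": "id", "question": "2 + 2 = ?", "choices": ["3", "4"], "answer": "B"},
    {"lang": "id", "question": "3 + 3 = ?", "choices": ["6", "5"], "answer": "A"},
    {"lang": "th", "question": "1 + 1 = ?", "choices": ["2", "3"], "answer": "A"},
    {"lang": "th", "question": "5 - 2 = ?", "choices": ["2", "3"], "answer": "B"},
]


def always_a(example):
    """Stand-in for a real model call: always answers "A"."""
    return "A"


overall, per_lang = score_multiple_choice(examples, always_a)
print(overall, per_lang)
```

In practice `predict` would wrap a call to a base or chat model and parse the chosen letter from its output. Reporting per-language scores alongside the overall number keeps weaker languages from being hidden by stronger ones.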

Quick Links