Vietnamese Corpus
Symato/cc • Updated Jul 11, 2023 • 9.26k • 2
Symato/c4_vi-filtered_200GB • Updated Sep 27, 2024 • 38.6M • 6
Symato/goods_vs_c4_cc_classifiers • Updated Jul 3, 2023 • 101k • 14
Symato/madlad-400_vi • Updated Sep 27, 2024 • 54.8M • 54

RAG
RAG-related datasets and tools
Symato/RAG_UltraDomain • Updated Sep 25, 2024 • 25 • 1
jinaai/jina-colbert-v2 • 0.6B • Updated Jan 17 • 41.9k • 123
ContextualBench-Leaderboard 🥇 • View and submit language model evaluations • 14
samaya-ai/msmarco-w-instructions • Updated Sep 18, 2024 • 980k • 288 • 3

Visual Datasets
One image is worth a thousand words
TIGER-Lab/VisualWebInstruct-Seed • Updated Mar 16 • 60.3k • 112 • 17
5CD-AI/Viet-ShareGPT-4o-Text-VQA • Updated Oct 1, 2024 • 42.7k • 107 • 49
5CD-AI/Viet-LAION-Gemini-VQA • Updated Oct 3, 2024 • 844k • 199 • 43
vidore/colpali_train_set • Updated 26 days ago • 119k • 3.88k • 82

trimm_vocab
Trims the vocabulary down to English and Vietnamese so the model is more compact and does not produce Chinese text during use
Symato/Qwen2.5-7B-Instruct__trimm_vocab • Updated Oct 21, 2024 • 2
Symato/bge-reranker-v2-m3__trimm_vocab__bf16 • 0.4B • Updated Oct 18, 2024 • 4
Symato/bge-m3__trimm_vocab__bf16 • 0.4B • Updated Oct 22, 2024 • 3
Symato/facebook_xlm-roberta-large__trimm_vocab__bf16 • 0.4B • Updated Oct 18, 2024 • 4
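
The idea behind this collection can be illustrated with a minimal sketch. It is not the procedure actually used to build these checkpoints, which the listing does not describe. The sketch assumes a toy vocabulary stored as a plain dict from token string to id and an embedding table stored as a list of rows; `is_cjk` and `trim_vocab` are hypothetical helper names. Real tokenizers (byte-level BPE, SentencePiece) store tokens in encoded form, so the character test would have to run on decoded token text, and the tokenizer's merge rules and the model's output layer would need matching updates.

```python
def is_cjk(ch):
    # Assumption: only the basic CJK Unified Ideographs block and Extension A
    # are checked; other CJK blocks are omitted to keep the sketch short.
    cp = ord(ch)
    return 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF


def trim_vocab(vocab, embeddings, special_tokens=()):
    """Drop tokens containing CJK ideographs and re-index the rest.

    vocab: dict mapping token string -> old id
    embeddings: list of embedding rows, indexed by old id
    special_tokens: tokens that are always kept
    Returns (new_vocab, new_embeddings) with contiguous new ids.
    """
    # Walk tokens in old-id order so the relative order is preserved.
    ordered = sorted(vocab.items(), key=lambda kv: kv[1])
    kept = [
        (tok, old_id)
        for tok, old_id in ordered
        if tok in special_tokens or not any(is_cjk(c) for c in tok)
    ]
    # Assign new contiguous ids and copy over the matching embedding rows.
    new_vocab = {tok: new_id for new_id, (tok, _) in enumerate(kept)}
    new_embeddings = [embeddings[old_id] for _, old_id in kept]
    return new_vocab, new_embeddings


# Toy data, invented purely for illustration.
toy_vocab = {"<s>": 0, "hello": 1, "你好": 2, "xin": 3, "chào": 4, "中": 5}
toy_embeddings = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
small_vocab, small_embeddings = trim_vocab(
    toy_vocab, toy_embeddings, special_tokens={"<s>"}
)
```

Because the embedding table shrinks together with the vocabulary, the parameter count drops, and tokens that no longer exist cannot be generated.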

Knowledge Base
Small but high quality
Symato/KB_wikimedia • Updated Sep 27, 2024 • 1.29M • 26
Symato/wikihow_vi-en-zh • Updated Sep 27, 2024 • 9.24k • 56 • 1
Symato/KB_tve-selected-books • Updated Sep 28, 2024 • 5

Vietnamese LLMs
The good ones
SeaLLMs/SeaLLMs-v3-7B-Chat • Text Generation • 8B • Updated Sep 2, 2024 • 25.8k • 53
CohereLabs/c4ai-command-r-plus-08-2024 • Text Generation • 104B • Updated Apr 15 • 4.3k • 273
google/gemma-2-27b-it • Text Generation • 27B • Updated Aug 27, 2024 • 101k • 548
Viet-Mistral/Vistral-7B-Chat • Text Generation • 7B • Updated Feb 27, 2024 • 2.82k • 140