A high-quality Vietnamese pretraining dataset for LLMs
Nguyễn Tiến Khôi
zerostratos
AI & ML interests: robots
Recent Activity
Updated a model about 5 hours ago: zerostratos/pop-to-piano-llama-v1
Published a model about 9 hours ago: zerostratos/pop-to-piano-llama-v1
Updated a dataset about 10 hours ago: zerostratos/mbd_qwen_toxic_val
Organizations
Collections 1
Spaces 3
Models 9

zerostratos/pop-to-piano-llama-v1 • Updated
zerostratos/llama3-cpt • Text Generation • Updated • 63
zerostratos/inversion_qwen • Text Generation • Updated • 33
zerostratos/cpt_llama • Text Generation • Updated • 12
zerostratos/quality_classification • Updated
zerostratos/quality_classifier_vietnamese_text • Updated
zerostratos/llama3-vie • Updated • 8
zerostratos/lstm • Updated
zerostratos/test • Text Generation • Updated • 1
Datasets 42
zerostratos/mbd_qwen_toxic_val • Viewer • Updated • 5.38k
zerostratos/mbd_qwen_toxic_train • Viewer • Updated • 43.2k
zerostratos/vietnamese_toxic_core • Viewer • Updated • 48.6k • 103
zerostratos/fin_question • Viewer • Updated • 8.55k • 115
zerostratos/vi-cc100-parquet-dataset • Viewer • Updated • 992M • 133
zerostratos/1 • Viewer • Updated • 6.85k • 113
zerostratos/question_gen • Viewer • Updated • 1k • 46
zerostratos/chunks • Viewer • Updated • 189k • 32
zerostratos/hotpotqa_train • Viewer • Updated • 170k • 31
zerostratos/music2400 • Viewer • Updated • 162 • 26