high-quality Chinese training datasets Collection a suite of high-quality Chinese datasets, used for pretraining, fine-tuning or preference alignment. And the models trained on these datasets. • 13 items • Updated May 22 • 19
Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence Paper • 2503.20533 • Published Mar 26 • 12
OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training Paper • 2501.08197 • Published Jan 14 • 8
OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training Paper • 2501.08197 • Published Jan 14 • 8
LLMs Do Not Think Step-by-step In Implicit Reasoning Paper • 2411.15862 • Published Nov 24, 2024 • 11
Patience Is The Key to Large Language Model Reasoning Paper • 2411.13082 • Published Nov 20, 2024 • 7
Hyper-multi-step: The Truth Behind Difficult Long-context Tasks Paper • 2410.04422 • Published Oct 6, 2024 • 7