Qian Liu

SivilTaram

http://siviltaram.github.io/

AI & ML interests

Cooking cool things

Organizations

Posts 4

Post

3530

Still following your human intuition to mix corpora from different sources for pre-training 🧠? Everyone says that data mixture has a big impact on model performance, but how - and why🕵️? Did you know that web corpora are actually highly impactful for downstream tasks 🏆?

Check out our preprint "RegMix: Data Mixture as Regression for Language Model Pre-training" 📄

🔬 In this paper, we propose RegMix, an automatic data mixture method that achieves a 6.3% improvement over human selection on the widely used HellaSwag benchmark, and it needs only 2% extra training FLOPs! 📈

📄 Paper: RegMix: Data Mixture as Regression for Language Model Pre-training (2407.01492)
💻 Code: https://github.com/sail-sg/regmix
📊 Collection: sail/regmix-data-mixture-as-regression-6682b6caab37b9442877f0ce
🎮 Demo: https://huggingface.co/spaces/sail/RegMix
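
The post does not spell out the procedure, so the following is only a toy sketch of the general idea in the title: treat the choice of data mixture as a regression problem, fit a model that maps mixture weights to a loss, and then search for the mixture with the lowest predicted loss. Everything concrete here is an assumption made for illustration: the domain names, the synthetic "proxy losses" (which in practice would come from training small models), and the use of a plain linear model. It is not the authors' implementation; see the linked code repository for that.

```python
import numpy as np

def fit_mixture_regression(mixtures, losses):
    """Least-squares fit of loss ~ mixtures @ w.

    No intercept is used: each mixture sums to 1, so a constant
    offset is already absorbed into the per-domain weights.
    """
    coef, *_ = np.linalg.lstsq(mixtures, losses, rcond=None)
    return coef

def search_best_mixture(coef, num_candidates, rng):
    """Sample candidate mixtures and return the one with the lowest predicted loss."""
    candidates = rng.dirichlet(np.ones(len(coef)), size=num_candidates)
    predicted = candidates @ coef
    return candidates[np.argmin(predicted)]

rng = np.random.default_rng(0)
domains = ["web", "code", "books"]  # hypothetical domain names

# Synthetic stand-in for results of small proxy training runs:
# each row is a mixture over the domains, each loss is made up.
train_mixtures = rng.dirichlet(np.ones(len(domains)), size=64)
made_up_effect = np.array([2.5, 3.2, 3.1])  # illustrative only, not real data
proxy_losses = train_mixtures @ made_up_effect + rng.normal(0.0, 0.01, size=64)

coef = fit_mixture_regression(train_mixtures, proxy_losses)
best = search_best_mixture(coef, num_candidates=10_000, rng=rng)

for name, weight in zip(domains, best):
    print(f"{name}: {weight:.3f}")
```

In this toy setup the search favors whichever domain was assigned the lowest made-up loss contribution; with real proxy runs, the fitted regressor would stand in for expensive training when comparing candidate mixtures.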

Articles 3

Article

16

RegMix: Data Mixture as Regression for Language Model Pre-training

Papers 47

arxiv:2511.03276

arxiv:2509.02479

arxiv:2507.12415

arxiv:2507.07017

Spaces 1

Meta Llama Meta Llama 3 8B

Answer questions using advanced text analysis

Models 20

SivilTaram/tongyao_models_v3

Updated Jul 30, 2025

SivilTaram/tongyao_models_v2

Updated Jul 26, 2025

SivilTaram/tongyao_models_0504

Updated May 5, 2025

SivilTaram/tongyao_models

Updated Mar 22, 2025

SivilTaram/mingzhe_models_llama3_1_8b_full_0119_dpo_ds3_2e-6

Updated Jan 24, 2025

SivilTaram/mingzhe_models_llama3_1_8b_full_0119_sft_ds3_2e-6

Updated Jan 24, 2025

SivilTaram/zephyr-7b-gemma-dpo-freeze-mlp

Updated Apr 14, 2024

SivilTaram/tapex-t5-large-finetuned-wtq

Updated Jun 30, 2022 • 6

SivilTaram/tapex-t5-xl-finetuned-wtq

Updated Jun 30, 2022

SivilTaram/tapex-t5-small-lm-adapt

Updated Jun 30, 2022 • 2

Datasets 3

SivilTaram/starcoder2-documentation

Viewer • Updated Aug 23, 2024 • 59.7k • 119 • 9

SivilTaram/code-document-sample

Viewer • Updated Oct 15, 2023 • 350 • 31

SivilTaram/poet-math

Updated Jun 30, 2022 • 17