- Article: "A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes" — by ybelkada and 1 other, Aug 17, 2022
- Article: "Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA" — by ybelkada and 4 others, May 24, 2023
- Space: "The Ultra-Scale Playbook 🌌" — the ultimate guide to training LLMs on large GPU clusters