cheng

zhoujun

BlankCheng

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

PAN: A World Model for General, Interactable, and Long-Horizon World Simulation

liked a model 4 months ago

LLM360/K2-Think

upvoted a paper 4 months ago

Deep contextualized word representations

liked a model 4 months ago

LLM360/K2-Think

Text Generation • 33B • Updated Nov 19, 2025 • 437 • 364

liked a dataset 7 months ago

LLM360/guru-RL-92k

Viewer • Updated Aug 20, 2025 • 91.9k • 1.7k • 42

liked a Space 11 months ago

The Ultra-Scale Playbook

🌌

3.63k

The ultimate guide to training LLM on large GPU Clusters

liked a dataset 11 months ago

agentica-org/DeepScaleR-Preview-Dataset

Viewer • Updated Feb 10, 2025 • 40.3k • 8.91k • 183

liked a model 11 months ago

Qwen/Qwen2.5-Math-7B-Instruct

Text Generation • 8B • Updated Sep 23, 2024 • 145k • 88

liked a Space about 1 year ago

Decentralized Arena Leaderboard

🥇

View and compare LLM evaluations across various domains

liked a dataset about 1 year ago

LLM360/TxT360

Updated May 26, 2025 • 53.2k • 247

liked a Space about 1 year ago

TxT360: Trillion Extracted Text

📖

132

Explore and analyze the TxT360 dataset for LLM pre-training

liked a dataset over 1 year ago

minimario/FOLIO

Viewer • Updated Jan 2, 2024 • 1.21k • 642 • 1

liked a dataset almost 2 years ago

bigcode/the-stack-v2

Viewer • Updated Apr 23, 2024 • 5.45B • 7.79k • 438

liked 2 models almost 2 years ago

deepseek-ai/deepseek-coder-7b-instruct-v1.5

Text Generation • 7B • Updated Feb 5, 2024 • 3.36k • 143

deepseek-ai/deepseek-coder-1.3b-instruct

Text Generation • 1B • Updated Mar 7, 2024 • 371k • 151

liked a model over 2 years ago

meta-llama/Llama-2-7b-chat-hf

Text Generation • 7B • Updated Apr 17, 2024 • 350k • 4.68k

liked a dataset over 2 years ago

bigcode/ta-prompt

Viewer • Updated May 4, 2023 • 650 • 107 • 200

liked 2 Spaces about 3 years ago

Binder

🔗

Code generation with 🤗

✨

258

Generate code snippets using multiple models