Chinese LLMs on Hugging Face
community
- deepseek-ai/DeepSeek-R1-0528
  Text Generation • 685B • 256k downloads • 2.2k likes
- deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
  Text Generation • 8B • 469k downloads • 841 likes
- ByteDance-Seed/BAGEL-7B-MoT
  Any-to-Any • 15B • 3.01k downloads • 1.08k likes
- ByteDance-Seed/Seed-Coder-8B-Reasoning
  Text Generation • 8B • 9.48k downloads • 136 likes
- fishaudio/fish-speech-1.5
  Text-to-Speech • 3.49k downloads • 598 likes
- ClearerVoice-Studio (Speech Enhancement, Separation and Extraction)
  Space • 251 likes • 📈 Better AI powered platform to purify your speech signal
- fishaudio/fish-speech-1.4
  Text-to-Speech • 335 downloads • 452 likes
- fishaudio/fish-speech-1.2
  Text-to-Speech • 148 downloads • 207 likes
- Open Agent Leaderboard
  Space • 16 likes • 🥇 Open Agent Leaderboard
- CompassJudger Subjective Evaluation Leaderboard
  Space • 4 likes • 🌎 CompassJudger Subjective Evaluation Leaderboard
- Open VLM Leaderboard
  Space • 814 likes • 🌎 VLMEvalKit Evaluation Results Collection
- Open Chinese LLM Leaderboard
  Space • 119 likes • 🏆 Browse and submit models in an evaluation leaderboard
- deepseek-ai/DeepSeek-V2.5-1210
  Text Generation • 236B • 15.4k downloads • 254 likes
- infly/OpenCoder-8B-Instruct
  Text Generation • 8B • 3.07k downloads • 194 likes
- Qwen/Qwen2.5-Coder-32B-Instruct
  Text Generation • 33B • 105k downloads • 1.9k likes
- deepseek-ai/DeepSeek-Coder-V2-Base
  Text Generation • 236B • 2.16k downloads • 75 likes
- Differential Transformer
  Paper • 2410.05258 • 179 upvotes
- Baichuan-Omni Technical Report
  Paper • 2410.08565 • 88 upvotes
- Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
  Paper • 2410.17243 • 95 upvotes
- FrugalNeRF: Fast Convergence for Few-shot Novel View Synthesis without Learned Priors
  Paper • 2410.16271 • 84 upvotes
text-to-video & image-to-video models released by the Chinese community
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • 295 upvotes
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • 405 upvotes
- 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
  Paper • 2501.00958 • 107 upvotes
- The Lessons of Developing Process Reward Models in Mathematical Reasoning
  Paper • 2501.07301 • 99 upvotes
- Qwen2.5-Coder Technical Report
  Paper • 2409.12186 • 149 upvotes
- Attention Heads of Large Language Models: A Survey
  Paper • 2409.03752 • 92 upvotes
- Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
  Paper • 2409.02634 • 98 upvotes
- OmniGen: Unified Image Generation
  Paper • 2409.11340 • 116 upvotes