YANG SHU
babytreecc
AI & ML interests
None yet
Recent Activity
upvoted a paper 5 days ago
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
updated a model 8 days ago
babytreecc/groupdp_tldr_reward_5.0_0.001
updated a model 9 days ago
babytreecc/rr_tldr_reward_5.0_0.001
Organizations
Collections: 1
Models: 10
babytreecc/groupdp_tldr_reward_5.0_0.001 • Text Classification • Updated • 1
babytreecc/rr_tldr_reward_5.0_0.001 • Text Classification • Updated • 1
babytreecc/dpsgd_tldr_reward_0.5_0.01 • Text Classification • Updated • 1
babytreecc/dpsgd_tldr_reward_1_0.01 • Text Classification • Updated • 1
babytreecc/groupdp_tldr_reward_8_0.001 • Text Classification • Updated • 1
babytreecc/rr_tldr_reward_8_0.01 • Text Classification • Updated • 3
babytreecc/dpsgd_tldr_reward_3_0.01 • Text Classification • Updated • 3
babytreecc/dpsgd_filter_tldr_reward_1.0_0.01 • Text Classification • Updated • 3
babytreecc/Qwen-2.5-3b-distilled-r1-1.5b-lora • Updated
babytreecc/SAE-pythia160m-RedPajama-Data-1T-Sample • Updated
Datasets: 15
babytreecc/0101_deepseek-r1-distill-DeepSeek-R1-Distill-Qwen-1.5B • Viewer • Updated • 256 • 97
babytreecc/0052_deepseek-r1-distill-DeepSeek-R1-Distill-Qwen-1.5B • Viewer • Updated • 312 • 83
babytreecc/0047_deepseek-r1-distill-DeepSeek-R1-Distill-Qwen-1.5B • Viewer • Updated • 56 • 89
babytreecc/2213_deepseek-r1-distill-DeepSeek-R1-Distill-Qwen-1.5B • Viewer • Updated • 56 • 84
babytreecc/test-deepseek-r1-distill-DeepSeek-R1-Distill-Qwen-1.5B • Viewer • Updated • 8 • 86
babytreecc/test-deepseek-r1-distill-DeepSeek-R1-Distill-Qwen-14B • Viewer • Updated • 8 • 81
babytreecc/test-deepseek-r1-distill-DeepSeek-R1-Distill-Llama-8B • Viewer • Updated • 8 • 90
babytreecc/test-deepseek-r1-distill-DeepSeek-R1-Distill-Qwen-7B • Viewer • Updated • 8 • 82
babytreecc/0-numina-deepseekr1-distill-DeepSeek-R1-Distill-Qwen-1.5B • Viewer • Updated • 256 • 83
babytreecc/numina-deepseekr1-distill-DeepSeek-R1-Distill-Qwen-14B • Viewer • Updated • 256 • 72