shaojintian

https://github.com/shaojintian

shaojintian's activity

liked a dataset 1 day ago

Qwen/ProcessBench

Viewer • Updated Dec 27, 2024 • 3.4k • 1.01k • 47

liked a model 1 day ago

Qwen/Qwen2.5-Math-PRM-7B

Text Classification • Updated Jan 17 • 17.6k • 70

liked a dataset 1 day ago

BytedTsinghua-SIA/DAPO-Math-17k

Viewer • Updated Apr 18 • 1.79M • 2.94k • 77

liked 2 datasets 3 days ago

ypwang61/One-Shot-RLVR-Datasets

Viewer • Updated 27 days ago • 1.98k • 304 • 3

Anthropic/hh-rlhf

Viewer • Updated May 26, 2023 • 169k • 10.6k • 1.36k

liked a model 3 days ago

rednote-hilab/dots.llm1.base

Text Generation • Updated 5 days ago • 1.1k • 52

liked a dataset 3 days ago

O1-OPEN/OpenO1-SFT

Viewer • Updated Apr 22 • 77.7k • 764 • 376

liked a dataset 8 days ago

HuggingFaceH4/aime_2024

Viewer • Updated Jan 26 • 30 • 27.4k • 34

liked a model 8 days ago

deepseek-ai/DeepSeek-R1-0528

Text Generation • Updated 17 days ago • 124k • 1.97k

authored 4 papers 11 days ago

VQ-Logits: Compressing the Output Bottleneck of Large Language Models via Vector Quantized Logits

Paper • 2505.10202 • Published about 1 month ago

Power-Law Decay Loss for Large Language Model Finetuning: A Theory Perspective

Paper • 2505.16900 • Published 23 days ago

ComplexFormer: Disruptively Advancing Transformer Inference Ability via Head-Specific Complex Vector Attention

Paper • 2505.10222 • Published about 1 month ago

Towards Analyzing and Understanding the Limitations of VAPO: A Theoretical Perspective

Paper • 2505.17997 • Published 23 days ago

liked 4 models 11 days ago

updated a model 11 days ago

shaojintian/complex_attention_0.5B

Updated 11 days ago • 73

published a model 11 days ago

shaojintian/complex_attention_0.5B

Updated 11 days ago • 73

liked a model 11 days ago

Qwen/Qwen2.5-1.5B-Instruct

Text Generation • Updated Sep 25, 2024 • 1.85M • 451