
Hanning Zhang

HanningZhang

AI & ML interests

None yet

Recent Activity

updated a model about 1 month ago
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher028_em-baseline_alldata_step80
published a model about 1 month ago
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher028_em-baseline_alldata_step80
updated a model about 1 month ago
HanningZhang/Qwen2.5-Math-7B-raft-plusplus_cliphigher028_em-baseline_alldata_step70

Organizations

RLHFlow, mytestdpo, ScaleBio Baseline

HanningZhang's activity

upvoted a paper about 1 month ago

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Paper • 2505.02391 • Published May 5 • 24
upvoted a paper about 2 months ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 92
upvoted a paper 4 months ago

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published Feb 26 • 84
upvoted a collection 7 months ago

RLHFlow MATH Process Reward Model

Collection
This is a collection of datasets and models of process reward modeling. • 15 items • Updated Nov 9, 2024 • 11