mradermacher/R-PRM-7B-DPO-GGUF
Reinforcement Learning
Transformers
GGUF
Chinese
reward-model
dpo
conversational
License: apache-2.0
main
R-PRM-7B-DPO-GGUF
Commit History
auto-patch README.md
1ebe133
verified
mradermacher
committed on
Mar 28
uploaded from rain
48bc109
verified
mradermacher
committed on
Mar 28
uploaded from rain
233a6ab
verified
mradermacher
committed on
Mar 28
uploaded from rain
76ab4f1
verified
mradermacher
committed on
Mar 28
uploaded from rain
5ce73d4
verified
mradermacher
committed on
Mar 28
uploaded from rain
afad83a
verified
mradermacher
committed on
Mar 28
uploaded from rain
776544f
verified
mradermacher
committed on
Mar 28
uploaded from rain
c491f4b
verified
mradermacher
committed on
Mar 28
uploaded from rain
78b1c5b
verified
mradermacher
committed on
Mar 28
uploaded from rain
90fd139
verified
mradermacher
committed on
Mar 28
uploaded from rain
7d91370
verified
mradermacher
committed on
Mar 28
uploaded from rain
90f26f7
verified
mradermacher
committed on
Mar 28
uploaded from rain
cc4929a
verified
mradermacher
committed on
Mar 28
uploaded from rain
6ba13f0
verified
mradermacher
committed on
Mar 28
initial commit
7f46f4a
verified
mradermacher
committed on
Mar 28