mradermacher/R-PRM-7B-DPO-GGUF
Tags: Reinforcement Learning, Transformers, GGUF, Chinese, reward-model, dpo, conversational
License: apache-2.0
1 contributor
History: 2 commits
Latest commit: 6ba13f0 (verified), "uploaded from rain", by mradermacher, 4 months ago
File            Size       Last commit          Updated
.gitattributes  1.52 kB    initial commit       4 months ago (scan: Safe)
README.md       211 Bytes  uploaded from rain   4 months ago