---
language: en
tags:
- rlhf
- checkpoint
- irl
- gpt-neo-125m
library_name: transformers
pipeline_tag: text-generation
---
# f9bs90mv-rlhf-checkpoint-gpt-neo-125m-irl-epoch-6
This is an RLHF model checkpoint saved at epoch 6 of training.
## Model Information
- Base Model: EleutherAI/gpt-neo-125M
- Reward Type: irl
- Dataset: allenai/real-toxicity-prompts (see the loading sketch below)
- Training Epoch: 6
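For reference, the training prompts can be loaded with the `datasets` library. A minimal sketch, not part of this checkpoint's code; the field names follow the allenai/real-toxicity-prompts dataset card:

```python
from datasets import load_dataset

# RealToxicityPrompts ships a single "train" split
ds = load_dataset("allenai/real-toxicity-prompts", split="train")

# Each record nests the prompt text under prompt["text"]
print(ds[0]["prompt"]["text"])
```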
## IRL Configuration
- Likelihood Type: bradley_terry (see the sketch after this list)
- Normalization Strategy: none
- IRL Artifact: matthieubou-imperial-college-london/bayes_irl_vi/posterior_bradley_terry_rkiq5pd8:v0
- Use Raw Score: True
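For context, the Bradley-Terry likelihood models the probability that one completion is preferred over another from their scalar reward scores. A minimal sketch of that formula; the scores `r_a` and `r_b` are hypothetical, and with `Normalization Strategy: none` and `Use Raw Score: True` the raw reward outputs would feed in directly:

```python
import math

def bradley_terry_prob(r_a: float, r_b: float) -> float:
    """P(a is preferred over b) = sigmoid(r_a - r_b)."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))

# Example: a reward gap of 1.0 gives ~0.73 preference probability
print(bradley_terry_prob(1.5, 0.5))  # ~0.731
```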
## Usage
This checkpoint can be loaded using the Hugging Face Transformers and TRL libraries:
```python
from trl import AutoModelForCausalLMWithValueHead

# Load the checkpoint (a causal LM with a value head, as used during RLHF training)
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    "MattBou00/f9bs90mv-rlhf-checkpoint-gpt-neo-125m-irl-epoch-6"
)
```
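To generate text with the loaded model, a minimal sketch; it assumes the tokenizer was saved alongside the checkpoint, and relies on the value-head wrapper forwarding `generate()` to the underlying language model:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "MattBou00/f9bs90mv-rlhf-checkpoint-gpt-neo-125m-irl-epoch-6"
)

inputs = tokenizer("The weather today is", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```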
## Training Configuration
The training configuration is saved in `training_config.yaml`.
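To inspect it without cloning the repo, a sketch using `huggingface_hub`; it assumes the file sits at the repo root:

```python
import yaml
from huggingface_hub import hf_hub_download

# Download just the config file from the checkpoint repo
config_path = hf_hub_download(
    repo_id="MattBou00/f9bs90mv-rlhf-checkpoint-gpt-neo-125m-irl-epoch-6",
    filename="training_config.yaml",
)

with open(config_path) as f:
    training_config = yaml.safe_load(f)
print(training_config)
```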