---
language: en
tags:
- rlhf
- checkpoint
- irl
- pythia-410m
library_name: transformers
pipeline_tag: text-generation
---

kybukre0-rlhf-checkpoint-pythia-410m-irl-epoch-20

This is an RLHF model checkpoint saved at training epoch 20.

Model Information

  • Base Model: EleutherAI/gpt-neo-125M
  • Reward Type: irl
  • Dataset: allenai/real-toxicity-prompts
  • Training Epoch: 20

IRL Configuration

  • Likelihood Type: bradley_terry (see the sketch after this list)
  • Normalization Strategy: none
  • IRL Artifact: matthieubou-imperial-college-london/bayes_irl_vi/posterior_bradley_terry_rkiq5pd8:v0
  • Use Raw Score: True

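The bradley_terry likelihood above treats reward learning as pairwise preference modelling: the probability that one completion is preferred over another is the sigmoid of the difference between their reward scores. The following is a minimal illustrative sketch of that likelihood (the function name and example values are hypothetical, not taken from the training code):

import torch
import torch.nn.functional as F

def bradley_terry_log_likelihood(reward_chosen, reward_rejected):
    # Bradley-Terry: p(chosen > rejected) = sigmoid(r_chosen - r_rejected).
    # logsigmoid gives a numerically stable log of that probability.
    return F.logsigmoid(reward_chosen - reward_rejected)

# Example with made-up reward scores
print(bradley_terry_log_likelihood(torch.tensor(1.2), torch.tensor(-0.3)))
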
Usage

This checkpoint can be loaded with the Hugging Face TRL library (built on Transformers):

from trl import AutoModelForCausalLMWithValueHead

# Load the checkpoint together with the value head used during RLHF training
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    "MattBou00/kybukre0-rlhf-checkpoint-pythia-410m-irl-epoch-20"
)
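
Once loaded, the model can be paired with the matching tokenizer for generation; a short sketch (the prompt and sampling settings are only examples):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MattBou00/kybukre0-rlhf-checkpoint-pythia-410m-irl-epoch-20")
inputs = tokenizer("The weather today is", return_tensors="pt")
# AutoModelForCausalLMWithValueHead forwards generate() to the wrapped causal LM
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))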

Training Configuration

The training configuration is saved in training_config.yaml.
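
To inspect that file locally, one option is to download it with huggingface_hub and parse it with PyYAML (a sketch, assuming training_config.yaml sits at the repo root):

import yaml
from huggingface_hub import hf_hub_download

config_path = hf_hub_download(
    repo_id="MattBou00/kybukre0-rlhf-checkpoint-pythia-410m-irl-epoch-20",
    filename="training_config.yaml",
)
with open(config_path) as f:
    training_config = yaml.safe_load(f)
print(training_config)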


