---
language: en
tags:
- rlhf
- checkpoint
- irl
- pythia-410m
library_name: transformers
pipeline_tag: text-generation
---

kybukre0-rlhf-checkpoint-pythia-410m-irl-epoch-20

This is an RLHF model checkpoint saved at training epoch 20.

Model Information

  • Base Model: EleutherAI/gpt-neo-125M
  • Reward Type: irl
  • Dataset: allenai/real-toxicity-prompts
  • Training Epoch: 20

IRL Configuration

  • Likelihood Type: bradley_terry (see the sketch after this list)
  • Normalization Strategy: none
  • IRL Artifact: matthieubou-imperial-college-london/bayes_irl_vi/posterior_bradley_terry_rkiq5pd8:v0
  • Use Raw Score: True

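The bradley_terry likelihood above treats reward learning as pairwise preference modelling: the probability that one completion is preferred over another is the sigmoid of the difference between their reward scores. The following is a minimal illustrative sketch of that likelihood (the function name and example values are hypothetical, not taken from the training code):

import torch
import torch.nn.functional as F

def bradley_terry_log_likelihood(reward_chosen, reward_rejected):
    # Bradley-Terry: p(chosen > rejected) = sigmoid(r_chosen - r_rejected).
    # logsigmoid gives a numerically stable log of that probability.
    return F.logsigmoid(reward_chosen - reward_rejected)

# Example with made-up reward scores
print(bradley_terry_log_likelihood(torch.tensor(1.2), torch.tensor(-0.3)))
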
Usage

This checkpoint can be loaded with the Hugging Face TRL library (built on Transformers):

from trl import AutoModelForCausalLMWithValueHead

# Load the checkpoint together with the value head used during RLHF training
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    "MattBou00/kybukre0-rlhf-checkpoint-pythia-410m-irl-epoch-20"
)
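
Once loaded, the model can be paired with the matching tokenizer for generation; a short sketch (the prompt and sampling settings are only examples):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MattBou00/kybukre0-rlhf-checkpoint-pythia-410m-irl-epoch-20")
inputs = tokenizer("The weather today is", return_tensors="pt")
# AutoModelForCausalLMWithValueHead forwards generate() to the wrapped causal LM
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))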

Training Configuration

The training configuration is saved in training_config.yaml.
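
To inspect that file locally, one option is to download it with huggingface_hub and parse it with PyYAML (a sketch, assuming training_config.yaml sits at the repo root):

import yaml
from huggingface_hub import hf_hub_download

config_path = hf_hub_download(
    repo_id="MattBou00/kybukre0-rlhf-checkpoint-pythia-410m-irl-epoch-20",
    filename="training_config.yaml",
)
with open(config_path) as f:
    training_config = yaml.safe_load(f)
print(training_config)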


