
toxicity-reward-model-output-max-margin-10-seed-200-unfrozen-layers-0-pythia-70m-checkpoint-30

This reward model was trained with max-margin inverse reinforcement learning (IRL) to learn toxicity reward signals.

- Base model: EleutherAI/pythia-70M
- Original model: EleutherAI/pythia-70M
- Detoxified model: ajagota71/pythia-70m-detox-epoch-100
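The card does not spell out the training objective, so the following is only a minimal sketch of one common max-margin formulation, not the code used for this checkpoint. It assumes the reward model scores paired samples from the detoxified and original models, and that training pushes the detoxified reward above the original reward by at least a margin. The function name `max_margin_loss`, its parameters, and the default `margin=1.0` are hypothetical and chosen for illustration.

```python
from typing import Sequence


def max_margin_loss(
    rewards_detox: Sequence[float],
    rewards_original: Sequence[float],
    margin: float = 1.0,  # hypothetical default, not taken from the model card
) -> float:
    """Mean hinge loss max(0, margin - (r_detox - r_original)) over paired samples.

    Assumption: each position pairs a reward for a detoxified-model sample
    with a reward for an original-model sample.
    """
    if len(rewards_detox) != len(rewards_original):
        raise ValueError("reward sequences must have the same length")
    if not rewards_detox:
        raise ValueError("reward sequences must not be empty")

    total = 0.0
    for r_detox, r_orig in zip(rewards_detox, rewards_original):
        # The pair contributes nothing once the detoxified sample
        # out-scores the original sample by at least `margin`.
        total += max(0.0, margin - (r_detox - r_orig))
    return total / len(rewards_detox)


if __name__ == "__main__":
    # First pair already satisfies the margin; second pair falls short by 0.5.
    print(max_margin_loss([2.0, 0.5], [0.0, 0.0], margin=1.0))
```

Under this formulation the loss is zero only when every detoxified sample is rewarded at least `margin` more than its paired original sample, which is what makes the learned score usable as a toxicity reward signal.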


language: en
tags:
- toxicity
- reward-model
- irl
library_name: transformers
base_model: pythia-70m
pipeline_tag: text-classification

Safetensors model size: 70.4M params, tensor type: F32