toxicity-reward-model-160m-prompt-output-max-margin-5-seed-42-unfrozen-layers-0-pythia-160m

This model was trained with max-margin inverse reinforcement learning (IRL) to learn a toxicity reward signal.
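To make the max-margin idea concrete, here is a minimal sketch of a hinge-style objective. This card does not state the exact loss used, so this is a generic formulation under stated assumptions: the reward assigned to detoxified outputs should exceed the reward assigned to the original model's outputs by at least a margin. The function name `max_margin_loss`, the default `margin=1.0`, and the sample scores are all hypothetical.

```python
from typing import Sequence


def max_margin_loss(
    detox_rewards: Sequence[float],
    original_rewards: Sequence[float],
    margin: float = 1.0,  # assumed value; the card does not specify the margin
) -> float:
    """Mean hinge loss over paired reward scores.

    A pair contributes zero loss once the detoxified output's reward
    exceeds the original output's reward by at least `margin`.
    """
    if len(detox_rewards) != len(original_rewards):
        raise ValueError("reward sequences must have the same length")
    if not detox_rewards:
        raise ValueError("reward sequences must not be empty")
    losses = [
        max(0.0, margin - (d - o))
        for d, o in zip(detox_rewards, original_rewards)
    ]
    return sum(losses) / len(losses)


# Hypothetical reward scores for three prompt/output pairs.
detox = [2.0, 0.5, 1.5]
orig = [0.0, 0.0, 1.0]
loss = max_margin_loss(detox, orig, margin=1.0)
print(f"{loss:.4f}")
```

In this toy example only the first pair already satisfies the margin, so the remaining two pairs carry all of the loss.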

Base model: EleutherAI/pythia-160m

Original model: EleutherAI/pythia-160M

Detoxified model: ajagota71/pythia-160m-detox-epoch-100


language: en
tags:
- toxicity
- reward-model
- irl
library_name: transformers
base_model: pythia-160m
pipeline_tag: text-classification

Safetensors model size: 162M params. Tensor type: F32.