toxicity-reward-model-output-max-margin-5-seed-400-unfrozen-layers-0-pythia-70m-checkpoint-30
This model was trained with max-margin inverse reinforcement learning (IRL) to learn a reward signal for toxicity.
Base model: EleutherAI/pythia-70M
Original model: EleutherAI/pythia-70M
Detoxified model: ajagota71/pythia-70m-detox-epoch-100
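The card does not state the exact training objective, so the following is only a minimal sketch of one common max-margin formulation, not the author's implementation. It assumes the reward model should score outputs of the detoxified model higher than outputs of the original model by at least a margin. The function name `max_margin_loss`, the `margin` default, and the sign convention are all hypothetical.

```python
from typing import Sequence


def max_margin_loss(
    detox_rewards: Sequence[float],
    original_rewards: Sequence[float],
    margin: float = 1.0,  # hypothetical default; the card does not give a value
) -> float:
    """Mean hinge loss over paired rewards.

    Assumption: each pair holds the reward for a detoxified-model output and
    the reward for the original-model output on the same prompt, and the
    detoxified output should score higher by at least `margin`.
    """
    if len(detox_rewards) != len(original_rewards):
        raise ValueError("reward sequences must have the same length")
    if not detox_rewards:
        raise ValueError("reward sequences must not be empty")

    # A pair contributes zero loss once the reward gap reaches the margin.
    losses = [
        max(0.0, margin - (r_detox - r_orig))
        for r_detox, r_orig in zip(detox_rewards, original_rewards)
    ]
    return sum(losses) / len(losses)


# The first pair already clears the margin; the second falls short by 0.5.
loss = max_margin_loss([2.0, 0.5], [0.0, 0.0], margin=1.0)
```

In this sketch only pairs that violate the margin contribute to the loss, so training effort concentrates on outputs the reward model does not yet separate clearly.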
language: en
tags:
- toxicity
- reward-model
- irl
library_name: transformers
base_model: pythia-70m
pipeline_tag: text-classification