# XLM-RoBERTa Toxicity Classifier
This model is a fine-tuned version of FacebookAI/xlm-roberta-base for multi-label toxicity classification.
## Model Description
This model can classify text into the following toxicity categories:
- Toxic
- Severe Toxic
- Obscene
- Threat
- Insult
- Identity Hate
- None (for non-toxic content)
## Usage
```python
from transformers import XLMRobertaForSequenceClassification, XLMRobertaTokenizer
import torch

# Load model and tokenizer
model = XLMRobertaForSequenceClassification.from_pretrained("oleksiizirka/xlm-roberta-toxicity-classifier")
tokenizer = XLMRobertaTokenizer.from_pretrained("oleksiizirka/xlm-roberta-toxicity-classifier")

# Prepare input
text = "Your text here"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

# Get predictions (independent sigmoid probability per label)
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.sigmoid(outputs.logits)

# Print labels whose probability exceeds 0.5
labels = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate', 'none']
for label, score in zip(labels, predictions[0]):
    if score > 0.5:
        print(f"{label}: {score:.3f}")
```
## Training Data
The model was trained on the Jigsaw Toxic Comment Classification dataset.
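The Jigsaw data labels each comment with six binary toxicity columns. A minimal preprocessing sketch for building the seven-dimensional multi-hot target used here (the extra `none` label marks comments with no toxicity flag set) might look like the following; the CSV path and column handling are assumptions, not the exact training script.

```python
import pandas as pd
import torch

TOXICITY_COLUMNS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def build_targets(csv_path: str) -> tuple[list[str], torch.Tensor]:
    """Read a local copy of the Jigsaw train CSV and build multi-hot targets with an extra 'none' label."""
    df = pd.read_csv(csv_path)
    toxic_flags = torch.tensor(df[TOXICITY_COLUMNS].values, dtype=torch.float32)
    # 'none' is 1 only when every toxicity flag is 0
    none_flag = (toxic_flags.sum(dim=1) == 0).float().unsqueeze(1)
    targets = torch.cat([toxic_flags, none_flag], dim=1)  # shape: (num_rows, 7)
    return df["comment_text"].tolist(), targets
```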
## Training Procedure
- Base model: FacebookAI/xlm-roberta-base
- Training approach: multi-label classification with BCEWithLogitsLoss (a rough sketch follows this list)
- Optimizer: AdamW with a learning rate of 2e-5
- Batch size: 16
- Epochs: 3-5 with early stopping
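As an illustration of the setup above, a training loop with BCEWithLogitsLoss, AdamW, and early stopping could be structured as follows. The data loaders, the `evaluate` helper, and the patience value are placeholders, not the original training code; `problem_type="multi_label_classification"` makes the Transformers model compute BCEWithLogitsLoss internally.

```python
from torch.optim import AdamW
from transformers import XLMRobertaForSequenceClassification

model = XLMRobertaForSequenceClassification.from_pretrained(
    "FacebookAI/xlm-roberta-base",
    num_labels=7,
    problem_type="multi_label_classification",  # use BCEWithLogitsLoss for multi-label targets
)
optimizer = AdamW(model.parameters(), lr=2e-5)

best_val_loss, patience, bad_epochs = float("inf"), 2, 0  # placeholder early-stopping settings
for epoch in range(5):
    model.train()
    for batch in train_loader:  # assumed DataLoader yielding tokenized batches with float multi-hot labels
        optimizer.zero_grad()
        loss = model(**batch).loss  # BCEWithLogitsLoss via problem_type above
        loss.backward()
        optimizer.step()

    val_loss = evaluate(model, val_loader)  # assumed validation helper returning mean loss
    if val_loss < best_val_loss:
        best_val_loss, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # stop early once validation loss stops improving
```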
## Limitations
- Trained primarily on English text
- May exhibit biases present in the training data
- Should be used as part of a larger content moderation system