XLM-RoBERTa Toxicity Classifier

This model is a fine-tuned version of FacebookAI/xlm-roberta-base for multi-label toxicity classification.

Model Description

This model can classify text into the following toxicity categories (a short sketch for verifying the label order follows the list):

  • Toxic
  • Severe Toxic
  • Obscene
  • Threat
  • Insult
  • Identity Hate
  • None (for non-toxic content)
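
The order of these seven outputs matters when reading predictions. A quick way to check whether the repository ships named labels is to inspect the config's id2label mapping; if the author did not set custom names, transformers falls back to generic LABEL_0 ... LABEL_6 placeholders, and the label order used in the Usage example below is an assumption.

from transformers import AutoConfig

# Inspect the index-to-label mapping stored in the model config.
config = AutoConfig.from_pretrained("oleksiizirka/xlm-roberta-toxicity-classifier")
print(config.id2label)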

Usage

from transformers import XLMRobertaForSequenceClassification, XLMRobertaTokenizer
import torch

# Load model and tokenizer
model = XLMRobertaForSequenceClassification.from_pretrained("oleksiizirka/xlm-roberta-toxicity-classifier")
tokenizer = XLMRobertaTokenizer.from_pretrained("oleksiizirka/xlm-roberta-toxicity-classifier")

# Prepare input
text = "Your text here"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

# Get predictions
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.sigmoid(outputs.logits)

# Print results
labels = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate', 'none']
for label, score in zip(labels, predictions[0]):
    if score > 0.5:
        print(f"{label}: {score:.3f}")

Training Data

The model was trained on the Jigsaw Toxic Comment Classification dataset.
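
The Jigsaw release provides a CSV with one binary column per toxicity type. Below is a minimal sketch of turning those columns into seven-dimensional target vectors; the file name train.csv reflects the Kaggle release, and the derived none column is an assumption matching the label list above, not part of the original dataset:

import numpy as np
import pandas as pd

LABELS = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']

# train.csv from the Jigsaw Toxic Comment Classification Challenge
df = pd.read_csv("train.csv")
texts = df["comment_text"].tolist()

targets = df[LABELS].to_numpy(dtype=np.float32)
# Extra 'none' column: 1.0 when no toxicity label is set on the row.
none_col = (targets.sum(axis=1) == 0).astype(np.float32)[:, None]
targets = np.concatenate([targets, none_col], axis=1)  # shape: (n_rows, 7)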

Training Procedure

  • Base model: FacebookAI/xlm-roberta-base
  • Training approach: Multi-label classification with BCEWithLogitsLoss
  • Optimization: AdamW with learning rate 2e-5
  • Batch size: 16
  • Epochs: 3-5 with early stopping (a minimal training-loop sketch follows this list)
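
A minimal sketch of that setup with PyTorch and transformers. The placeholder data and fixed epoch count are simplifications, not the actual training script; only the loss, optimizer, learning rate, and batch size come from the list above:

import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import XLMRobertaForSequenceClassification, XLMRobertaTokenizer

model = XLMRobertaForSequenceClassification.from_pretrained(
    "FacebookAI/xlm-roberta-base", num_labels=7
)
tokenizer = XLMRobertaTokenizer.from_pretrained("FacebookAI/xlm-roberta-base")

# Placeholder data; in practice use the Jigsaw texts and 7-dimensional targets built above.
texts = ["example comment one", "example comment two"]
targets = torch.zeros(len(texts), 7)

enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], targets)
loader = DataLoader(dataset, batch_size=16, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = torch.nn.BCEWithLogitsLoss()

model.train()
for epoch in range(3):  # 3-5 epochs with early stopping in the original setup
    for input_ids, attention_mask, labels in loader:
        optimizer.zero_grad()
        logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
        loss = loss_fn(logits, labels)
        loss.backward()
        optimizer.step()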

Limitations

  • Trained primarily on English text
  • May exhibit biases present in the training data
  • Should be used as part of a larger content moderation system