🌍 Multilingual Sentiment Classifier (XLM-RoBERTa)

This model is a fine-tuned version of xlm-roberta-base for multilingual sentiment classification across English, German, and Italian.

We built this model to classify sentiment into:

  • 0 → Negative
  • 1 → Neutral
  • 2 → Positive

✍️ How We Built It

This model was fine-tuned using the Amazon Reviews Multilingual Dataset, specifically on the English, German, and Italian subsets.
Training was done using PyTorch and Hugging Face Transformers.

Preprocessing

  • Texts were tokenized using XLMRobertaTokenizer
  • Labels were mapped to integers (negative: 0, neutral: 1, positive: 2)
  • The dataset was split into train/validation/test sets with an 80/10/10 ratio (a minimal sketch follows)
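
The steps above can be summarized in a short sketch. This is a minimal reconstruction, not the exact training script: the dataset identifier, the text and sentiment column names, and the split seed are all illustrative assumptions.

from datasets import load_dataset
from transformers import XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
label2id = {"negative": 0, "neutral": 1, "positive": 2}

def preprocess(example):
    # Tokenize the review text and map the string label to an integer id.
    encoded = tokenizer(example["text"], truncation=True, max_length=256)
    encoded["label"] = label2id[example["sentiment"]]
    return encoded

# Hypothetical dataset id; 80/10/10 split: carve off 20%, then halve it.
raw = load_dataset("username/amazon-reviews-multilingual", split="train")
split = raw.train_test_split(test_size=0.2, seed=42)
holdout = split["test"].train_test_split(test_size=0.5, seed=42)
train_ds = split["train"].map(preprocess)
val_ds = holdout["train"].map(preprocess)
test_ds = holdout["test"].map(preprocess)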

Training

  • Model: xlm-roberta-base
  • Epochs: 2
  • Optimizer: AdamW
  • Batch size: 8
  • Evaluation metric: Macro F1-score
  • Hardware: Google Colab GPU
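
A condensed training sketch consistent with these settings, reusing the datasets from the preprocessing sketch above. Hugging Face's Trainer uses AdamW by default; the compute_metrics function is our assumption about how macro F1 was computed.

from sklearn.metrics import f1_score
from transformers import (
    Trainer,
    TrainingArguments,
    XLMRobertaForSequenceClassification,
)

model = XLMRobertaForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=3
)

def compute_metrics(eval_pred):
    # Macro F1 weights all three classes equally, regardless of support.
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)
    return {"macro_f1": f1_score(labels, preds, average="macro")}

args = TrainingArguments(
    output_dir="multilingual-sentiment-xlm-roberta",
    num_train_epochs=2,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()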

🔍 Example Usage

from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="subba5076/multilingual-sentiment-xlm-roberta")

classifier("Der Film war unglaublich schΓΆn.")  # German
classifier("This phone is terrible.")           # English
classifier("È stato un buon acquisto.")         # Italian

📊 Evaluation
Macro F1-score on the test set: 0.81
A confusion matrix and training curves may be shared in a future update.
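
For reference, a minimal sketch of how the test-set score can be reproduced, reusing trainer, test_ds, and f1_score from the sketches above:

predictions = trainer.predict(test_ds)
preds = predictions.predictions.argmax(axis=-1)
macro_f1 = f1_score(predictions.label_ids, preds, average="macro")
print(f"Macro F1 on the test set: {macro_f1:.2f}")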

👨‍💻 Authors
This project was developed as part of a team NLP assignment.

Team Members:

  • Subrahmanya Rajesh Nayak @subba5076
  • Rim Tafech

🪪 License
This model is licensed under the MIT License.