# Model Card for Afro-XLMR Fine-Tuned for Regression
This model is a fine-tuned version of [davlan/afro-xlmr-base](https://huggingface.co/davlan/afro-xlmr-base) for a regression task on multilingual text data. The model outputs continuous values and was optimized with the 🤗 `Trainer` API, using Optuna for hyperparameter tuning.
## Model Details

### Model Description
This model builds upon Afro-XLM-RoBERTa (`afro-xlmr-base`), a multilingual language model trained on multiple African languages. We fine-tuned it for a regression task on a custom dataset where the goal is to predict a numerical score from input text.
- Developed by: Michel Roland
- Model type: Transformer (XLM-RoBERTa variant)
- Language(s): Multilingual (including many African languages)
- License: Same as `davlan/afro-xlmr-base`
- Fine-tuned from: `davlan/afro-xlmr-base`
### Model Sources
- Base model: https://huggingface.co/davlan/afro-xlmr-base
- Fine-tuned repo: None
## Uses

### Direct Use
You can use this model to predict a numerical score from a single text input.
Example:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned regression model and its tokenizer
model = AutoModelForSequenceClassification.from_pretrained("MichelRoland/Haussa-Afro-xlmr-base")
tokenizer = AutoTokenizer.from_pretrained("MichelRoland/Haussa-Afro-xlmr-base")

inputs = tokenizer("Your input text here", return_tensors="pt")
outputs = model(**inputs)

# The single logit is the predicted score
prediction = outputs.logits.squeeze().item()
```
### Downstream Use
- Further fine-tuning on similar multilingual regression tasks
- Plug into a pipeline for text-based scoring
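As an illustration of the pipeline use case above, the model can be wrapped in a standard `text-classification` pipeline; passing `function_to_apply="none"` keeps the raw regression output instead of applying a softmax or sigmoid. This sketch is an assumption, not something documented in this card:

```python
from transformers import pipeline

scorer = pipeline("text-classification", model="MichelRoland/Haussa-Afro-xlmr-base")

# function_to_apply="none" returns the raw regression value as the score
result = scorer("Your input text here", function_to_apply="none")
print(result)  # e.g. [{'label': 'LABEL_0', 'score': <predicted value>}]
```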
### Out-of-Scope Use
- Not suitable for classification, text generation, or language translation tasks
- Not intended for high-stakes decision-making without further validation
## Bias, Risks, and Limitations
- Inherits limitations from the Afro-XLMR base model
- Performance may vary across different languages
- Biases in the training data may propagate in predictions
### Recommendations
- Perform evaluation on your specific dataset
- Consider additional fine-tuning for domain-specific tasks
- Monitor fairness and bias in multilingual use-cases
## How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("MichelRoland/Haussa-Afro-xlmr-base")
model = AutoModelForSequenceClassification.from_pretrained("MichelRoland/Haussa-Afro-xlmr-base")
```
## Training Details

### Training Data
[Describe your dataset here or link to a dataset card.]
### Training Procedure
- Preprocessing: Tokenization with truncation and padding
- Loss function: MSELoss (regression)
- Trainer API: Hugging Face `Trainer` with a `compute_metrics` function returning RMSE and Spearman correlation
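The exact `compute_metrics` implementation is not reproduced in this card; a minimal sketch that returns RMSE and Spearman correlation (the `Trainer` prefixes these as `eval_rmse` / `eval_spearman` during evaluation) could look like:

```python
import numpy as np
from scipy.stats import spearmanr

def compute_metrics(eval_pred):
    # The Trainer passes a (predictions, labels) pair
    predictions, labels = eval_pred
    predictions = np.squeeze(predictions)
    rmse = float(np.sqrt(np.mean((predictions - labels) ** 2)))
    spearman, _ = spearmanr(predictions, labels)
    return {"rmse": rmse, "spearman": float(spearman)}
```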
### Training Hyperparameters
- Learning Rate: Log-scale search between 1e-5 and 5e-4
- Epochs: 3–10
- Batch Size: 8, 16, or 32
- Weight Decay: 0.0–0.3
- Trials: 10 (using Optuna)
- Objective: Minimize `eval_rmse`
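The Optuna search space is not reproduced in this card; a sketch matching the ranges listed above (parameter names follow `TrainingArguments`) might be:

```python
def hp_space(trial):
    # Optuna trial -> TrainingArguments overrides, using the ranges listed above
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 3, 10),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [8, 16, 32]
        ),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.3),
    }
```

The search itself was run as: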
```python
best_run = trainer.hyperparameter_search(
    direction="minimize",
    hp_space=hp_space,
    n_trials=10,
    compute_objective=lambda metrics: metrics["eval_rmse"],
    backend="optuna",
)
```
## Evaluation

### Testing Data
https://github.com/semantic-textual-relatedness/Semantic_Relatedness_SemEval2024#dataset
### Metrics
- RMSE (Root Mean Square Error)
- Spearman’s Rank Correlation
Example result:

```json
{
  "eval_spearman": 0.64
}
```
## License

Same as the base model, `davlan/afro-xlmr-base`.