Model Card for Afro-XLMR Fine-Tuned for Regression

This model is a fine-tuned version of davlan/afro-xlmr-base for a regression task on multilingual text data. The model outputs continuous values and was trained with the 🤗 Trainer API, using Optuna for hyperparameter tuning.


Model Details

Model Description

This model builds upon Afro-XLMR (davlan/afro-xlmr-base), a multilingual language model trained on multiple African languages. It was fine-tuned for a regression task on a custom dataset where the goal is to predict a numerical score from input text.

  • Developed by: Michel Roland
  • Model type: Transformer (XLM-RoBERTa variant)
  • Language(s): Multilingual (including many African languages)
  • License: Same as davlan/afro-xlmr-base
  • Fine-tuned from: davlan/afro-xlmr-base

Model Sources

  • Repository: MichelRoland/Haussa-Afro-xlmr-base on the Hugging Face Hub
  • Base model: davlan/afro-xlmr-base

Uses

Direct Use

You can use this model to predict a numerical score from a single text input.

Example:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# The single-output regression head is part of the saved checkpoint.
model = AutoModelForSequenceClassification.from_pretrained("MichelRoland/Haussa-Afro-xlmr-base")
tokenizer = AutoTokenizer.from_pretrained("MichelRoland/Haussa-Afro-xlmr-base")

inputs = tokenizer("Your input text here", return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

# For a single-label regression head, the logit itself is the predicted score.
prediction = outputs.logits.squeeze().item()

Downstream Use

  • Further fine-tuning on similar multilingual regression tasks
  • Plug into a pipeline for text-based scoring (see the sketch below)
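
A minimal sketch of pipeline-based scoring, assuming the standard text-classification pipeline with the activation function disabled so the raw regression output is returned as the score:

from transformers import pipeline

# function_to_apply="none" skips the softmax/sigmoid, so the pipeline
# returns the raw regression logit as the score.
scorer = pipeline(
    "text-classification",
    model="MichelRoland/Haussa-Afro-xlmr-base",
    function_to_apply="none",
)

print(scorer("Your input text here"))
# e.g. [{'label': 'LABEL_0', 'score': <predicted value>}]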

Out-of-Scope Use

  • Not suitable for classification, text generation, or language translation tasks
  • Not intended for high-stakes decision-making without further validation


Bias, Risks, and Limitations

  • Inherits limitations from the Afro-XLMR base model
  • Performance may vary across different languages
  • Biases in the training data may propagate into predictions

Recommendations

  • Perform evaluation on your specific dataset
  • Consider additional fine-tuning for domain-specific tasks
  • Monitor fairness and bias in multilingual use-cases

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("MichelRoland/Haussa-Afro-xlmr-base")
model = AutoModelForSequenceClassification.from_pretrained("MichelRoland/Haussa-Afro-xlmr-base")

Training Details

Training Data

[Describe your dataset here or link to a dataset card.]

Training Procedure

  • Preprocessing: Tokenization with truncation and padding
  • Loss function: MSELoss (regression)
  • Trainer API: Hugging Face Trainer with a compute_metrics function returning RMSE and Spearman correlation (see the sketch below)
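
A minimal sketch of the setup described above. Column names, sequence length, and training arguments are illustrative assumptions, not values taken from the actual training run:

import numpy as np
from scipy.stats import spearmanr
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("davlan/afro-xlmr-base")

def preprocess(batch):
    # Tokenization with truncation and padding, as described above.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

def model_init():
    # num_labels=1 gives a single-output head; the Trainer then uses MSELoss.
    return AutoModelForSequenceClassification.from_pretrained(
        "davlan/afro-xlmr-base", num_labels=1
    )

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = predictions.squeeze(-1)
    rmse = float(np.sqrt(np.mean((predictions - labels) ** 2)))
    spearman = float(spearmanr(predictions, labels).correlation)
    return {"rmse": rmse, "spearman": spearman}

# train_dataset / eval_dataset are assumed to be tokenized datasets with a
# float "label" column holding the target score.
trainer = Trainer(
    model_init=model_init,
    args=TrainingArguments(output_dir="out", eval_strategy="epoch"),
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)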

Training Hyperparameters

  • Learning Rate: Log-scale search between 1e-5 and 5e-4
  • Epochs: 3–10
  • Batch Size: 8, 16, or 32
  • Weight Decay: 0.0–0.3
  • Trials: 10 (using Optuna)
  • Objective: Minimize eval_rmse

best_run = trainer.hyperparameter_search(
    direction="minimize",
    hp_space=hp_space,
    n_trials=10,
    compute_objective=lambda metrics: metrics["eval_rmse"],
    backend="optuna",
)
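
A minimal sketch of an hp_space function matching the ranges listed above (the function name and the exact Optuna sampling calls are assumptions, not taken from the original training script):

def hp_space(trial):
    # Ranges mirror the hyperparameter list above.
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 3, 10),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [8, 16, 32]
        ),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.3),
    }

Note that hyperparameter_search requires the Trainer to be built with model_init (as in the training sketch above) so that a fresh model can be instantiated for each trial.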

Evaluation

Testing Data

https://github.com/semantic-textual-relatedness/Semantic_Relatedness_SemEval2024#dataset

Metrics

  • RMSE (Root Mean Square Error)
  • Spearman’s Rank Correlation

Example result:

{
  "eval_spearman": 0.64
}

License

Same as the base model, davlan/afro-xlmr-base.
