# mT5-Small (Taxi1500 Maltese)
This model is a fine-tuned version of google/mt5-small on the Taxi1500 dataset. It achieves the following results on the test set:
- Loss: 0.7
- F1: 0.4220
## Intended uses & limitations
The model is fine-tuned on a specific task, so it should only be used for the same or a similar task. Any limitations present in the base model are inherited.
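As a minimal usage sketch (assuming the checkpoint is loaded through the standard `transformers` text-to-text API; the example sentence and the absence of any task prefix are assumptions, not taken from the training script):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint (repository name as published with this card).
tokenizer = AutoTokenizer.from_pretrained("MLRS/mt5-small_taxi1500-mlt")
model = AutoModelForSeq2SeqLM.from_pretrained("MLRS/mt5-small_taxi1500-mlt")

# Illustrative Maltese input; the exact input template used during fine-tuning
# may differ (e.g. a task prefix), so treat this as a placeholder.
text = "Fil-bidu Alla ħalaq is-sema u l-art."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# mT5 is a text-to-text model, so the predicted class is generated as text.
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```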
## Training procedure
The model was fine-tuned using a customised script; a rough sketch of a comparable setup is given after the hyperparameter list below.
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adafactor (no additional optimizer arguments)
- lr_scheduler_type: linear
- num_epochs: 200.0
- early_stopping_patience: 20
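The customised script itself is not reproduced here, but a rough sketch of a comparable setup with the hyperparameters listed above might look as follows. The dataset objects (`train_dataset`, `validation_dataset`) and the metric function (`compute_f1`) are hypothetical placeholders, and the use of `Seq2SeqTrainer` is an assumption rather than the authors' actual code:

```python
from transformers import (AutoModelForSeq2SeqLM, EarlyStoppingCallback,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

args = Seq2SeqTrainingArguments(
    output_dir="mt5-small_taxi1500-mlt",
    learning_rate=1e-3,                 # learning_rate: 0.001
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=200,
    optim="adafactor",                  # Adafactor, no additional arguments
    lr_scheduler_type="linear",
    seed=42,
    eval_strategy="epoch",              # the results table reports one eval per epoch
    save_strategy="epoch",
    load_best_model_at_end=True,        # needed for early stopping
    metric_for_best_model="f1",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,        # tokenised Taxi1500 train split (not shown)
    eval_dataset=validation_dataset,    # tokenised Taxi1500 validation split (not shown)
    compute_metrics=compute_f1,         # hypothetical macro-F1 metric function
    callbacks=[EarlyStoppingCallback(early_stopping_patience=20)],
)
trainer.train()
```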
### Training results
| Training Loss | Epoch | Step | Validation Loss | F1 |
|---|---|---|---|---|
| No log | 1.0 | 27 | 7.1210 | 0.4432 |
| No log | 2.0 | 54 | 0.7519 | 0.4546 |
| No log | 3.0 | 81 | 0.7215 | 0.3865 |
| No log | 4.0 | 108 | 0.7781 | 0.4213 |
| No log | 5.0 | 135 | 0.7418 | 0.3728 |
| No log | 6.0 | 162 | 0.7876 | 0.3881 |
| No log | 7.0 | 189 | 0.8915 | 0.3570 |
| No log | 8.0 | 216 | 0.7115 | 0.3611 |
| No log | 9.0 | 243 | 0.7800 | 0.3487 |
| No log | 10.0 | 270 | 0.7971 | 0.3928 |
| No log | 11.0 | 297 | 0.7406 | 0.3707 |
| No log | 12.0 | 324 | 0.7309 | 0.3527 |
| No log | 13.0 | 351 | 0.6971 | 0.4233 |
| No log | 14.0 | 378 | 0.8458 | 0.3515 |
| No log | 15.0 | 405 | 0.7301 | 0.3515 |
| No log | 16.0 | 432 | 2.9614 | 0.1838 |
| No log | 17.0 | 459 | 0.7779 | 0.1903 |
| No log | 18.0 | 486 | 0.7124 | 0.3556 |
| 2.0694 | 19.0 | 513 | 0.7182 | 0.4250 |
| 2.0694 | 20.0 | 540 | 0.7275 | 0.4385 |
| 2.0694 | 21.0 | 567 | 0.7660 | 0.3660 |
| 2.0694 | 22.0 | 594 | 0.7280 | 0.3556 |
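Assuming the F1 reported above is a macro-average over the Taxi1500 classes, it can be computed from gold and predicted labels with scikit-learn; the labels below are illustrative placeholders only:

```python
from sklearn.metrics import f1_score

# Illustrative gold and predicted class labels -- not actual Taxi1500 data.
gold = ["class_a", "class_b", "class_a", "class_c"]
pred = ["class_a", "class_a", "class_a", "class_c"]

# Macro-averaging weights every class equally, regardless of class frequency.
print(f1_score(gold, pred, average="macro"))
```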
### Framework versions
- Transformers 4.48.2
- PyTorch 2.4.1+cu121
- Datasets 3.2.0
- Tokenizers 0.21.0
## License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at https://mlrs.research.um.edu.mt/.
## Citation
This work was first presented in MELABenchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP. Cite it as follows:
```bibtex
@inproceedings{micallef-borg-2025-melabenchv1,
    title = "{MELAB}enchv1: Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource {M}altese {NLP}",
    author = "Micallef, Kurt and
      Borg, Claudia",
    editor = "Che, Wanxiang and
      Nabende, Joyce and
      Shutova, Ekaterina and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-acl.1053/",
    doi = "10.18653/v1/2025.findings-acl.1053",
    pages = "20505--20527",
    ISBN = "979-8-89176-256-5",
}
```
