---
license: apache-2.0
datasets:
- Mavkif/roman-urdu-msmarco-dataset
language:
- ur
base_model:
- unicamp-dl/mt5-base-mmarco-v2
pipeline_tag: question-answering
tags:
- mt5
- information-retrieval
- NLP
- urdu
- roman-urdu
---
|
# Roman Urdu mT5 msmarco: Fine-Tuned mT5 Model for Roman-Urdu Information Retrieval |
|
As part of ongoing efforts to make Information Retrieval (IR) more inclusive, this model addresses the needs of low-resource languages, focusing specifically on Roman Urdu.
We created the training data by translating the MS MARCO dataset into Roman Urdu using the IndicTrans2 model.
To establish a baseline, we first evaluated the unicamp-dl/mt5-base-mmarco-v2 model zero-shot on Roman-Urdu IR, and then fine-tuned it on the translated dataset following the mMARCO multilingual IR methodology, achieving state-of-the-art results for Roman-Urdu IR.
|
## Model Details |
|
### Model Description |
|
- **Developed by:** Umer Butt |
|
- **Model type:** IR model for reranking |
|
- **Language(s) (NLP):** Roman Urdu (ur)
- **License:** apache-2.0
- **Finetuned from model:** unicamp-dl/mt5-base-mmarco-v2
- **Framework:** Python / PyTorch
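
As a monoT5-style reranker, the model scores a query-passage pair by generating a relevance token for the prompt `Query: ... Document: ... Relevant:`. Below is a minimal usage sketch, assuming the "yes"/"no" prediction tokens follow the mMARCO monoT5 convention; `model_id` is a placeholder to replace with this repository's actual id.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder: replace with this model's actual Hugging Face repo id.
model_id = "<this-model-repo-id>"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
model.eval()

# Token ids for the relevance labels; assumed to follow the
# monoT5/mMARCO convention of generating "yes" or "no".
yes_id = tokenizer.encode("yes", add_special_tokens=False)[0]
no_id = tokenizer.encode("no", add_special_tokens=False)[0]

def rerank_score(query: str, passage: str) -> float:
    """Return P("yes") for a query-passage pair as the relevance score."""
    prompt = f"Query: {query} Document: {passage} Relevant:"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=1,
            output_scores=True,
            return_dict_in_generate=True,
        )
    logits = out.scores[0][0]  # vocabulary logits for the generated token
    probs = torch.softmax(logits[[no_id, yes_id]], dim=0)
    return probs[1].item()

# Illustrative Roman-Urdu query and candidate passages.
query = "pakistan ka darul hukumat kya hai"
passages = [
    "Islamabad Pakistan ka darul hukumat hai.",
    "Cricket Pakistan ka sab se mashhoor khel hai.",
]
ranked = sorted(passages, key=lambda p: rerank_score(query, p), reverse=True)
print(ranked[0])
```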
|
## Bias, Risks, and Limitations |
|
Although this model performs well and is currently state-of-the-art for Roman-Urdu retrieval, it was fine-tuned from the mMARCO reranker on a translated dataset (created with the IndicTrans2 model), so the limitations of both the base model and the translation pipeline carry over. In particular, translation errors in the training data may propagate into ranking quality.
|
## Evaluation |
|
The evaluation was done using the scripts in the pygaggle library, specifically these files:

- `evaluate_monot5_reranker.py`
- `ms_marco_eval.py`
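
For reference, `ms_marco_eval.py` reports the official MS MARCO MRR@10 metric. The sketch below shows the core of that computation; the TSV layouts (qrels as `qid 0 pid rel`, run as `qid pid rank`) follow the MS MARCO conventions, and the file names are hypothetical.

```python
from collections import defaultdict

def load_qrels(path: str) -> dict:
    """Load relevant passage ids per query (qrels line: qid 0 pid rel)."""
    qrels = defaultdict(set)
    with open(path) as f:
        for line in f:
            qid, _, pid, rel = line.split()
            if int(rel) > 0:
                qrels[qid].add(pid)
    return qrels

def load_run(path: str) -> dict:
    """Load ranked passage ids per query (run line: qid pid rank)."""
    run = defaultdict(list)
    with open(path) as f:
        for line in f:
            qid, pid, rank = line.split()
            run[qid].append((int(rank), pid))
    return {qid: [pid for _, pid in sorted(pairs)] for qid, pairs in run.items()}

def mrr_at_10(qrels: dict, run: dict) -> float:
    """Mean reciprocal rank of the first relevant passage in the top 10."""
    total = 0.0
    for qid, ranking in run.items():
        for rank, pid in enumerate(ranking[:10], start=1):
            if pid in qrels.get(qid, set()):
                total += 1.0 / rank
                break
    return total / len(run)

# Hypothetical file names for illustration.
print(mrr_at_10(load_qrels("qrels.dev.tsv"), load_run("run.dev.tsv")))
```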
|
### Model Architecture and Objective |
|
```json
{
  "_name_or_path": "unicamp-dl/mt5-base-mmarco-v2",
  "architectures": ["MT5ForConditionalGeneration"],
  "d_model": 768,
  "num_heads": 12,
  "num_layers": 12,
  "dropout_rate": 0.1,
  "vocab_size": 250112,
  "model_type": "mt5",
  "transformers_version": "4.45.2"
}
```
|
For more details on how to customize the decoding parameters (such as `max_length`, `num_beams`, and `early_stopping`), refer to the Hugging Face documentation.
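
As an illustration, these parameters can be passed directly to `generate()`; the repo id below is a placeholder for this model's actual id.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "<this-model-repo-id>"  # placeholder: replace with the actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

prompt = "Query: behtareen biryani kahan milti hai Document: Karachi apni biryani ke liye mashhoor hai. Relevant:"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_length=2,         # reranking needs only the single relevance token
    num_beams=1,          # greedy decoding is enough for scoring
    early_stopping=True,  # only takes effect when num_beams > 1
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```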
|
|