andreaschari committed on
Commit 45baaa1 · verified · 1 Parent(s): 31ea4e5

Create README.md

Files changed (1)
  1. README.md +18 -0
README.md ADDED
@@ -0,0 +1,18 @@
+ ---
+ license: mit
+ datasets:
+ - unicamp-dl/mmarco
+ language:
+ - zh
+ base_model:
+ - unicamp-dl/mt5-base-mmarco-v2
+ ---
+
+ # mt5-base Reranker ZH mMARCO/v2 Transliterated Queries tokenised with Anserini
+
+ This is a variation of Unicamp's [mt5-base Reranker](https://huggingface.co/unicamp-dl/mt5-base-mmarco-v2) initially fine-tuned on mMARCO/v2.
+
+ The queries were transliterated from Chinese script to Latin script (romanised text) using [uroman](https://github.com/isi-nlp/uroman).
+ The transliterated queries were then tokenised with [pyterrier_anserini](https://github.com/seanmacavaney/pyterrier-anserini/tree/main/pyterrier_anserini).
+
+ The model was used in the SIGIR 2025 short paper "Lost in Transliteration: Bridging the Script Gap in Neural IR".
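
For readers who want to try the reranker, the following is a minimal sketch of how a transliterated query and a passage might be formatted and scored. It assumes the model keeps the monoT5-style input template (`Query: ... Document: ... Relevant:`) and the yes/no relevance tokens of its base model; neither is stated in this README. The function names `build_rerank_input` and `relevance_score`, the example strings, and the logit values are hypothetical placeholders, not real model outputs.

```python
import math


def build_rerank_input(query: str, passage: str) -> str:
    """Format a query/passage pair with the monoT5-style template.

    Assumption: this variant expects the same template as its base model.
    """
    return f"Query: {query} Document: {passage} Relevant:"


def relevance_score(logit_yes: float, logit_no: float) -> float:
    """Softmax over the 'yes' and 'no' logits; returns P(relevant)."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logit_yes, logit_no)
    e_yes = math.exp(logit_yes - m)
    e_no = math.exp(logit_no - m)
    return e_yes / (e_yes + e_no)


if __name__ == "__main__":
    # Illustrative romanised query and Chinese passage (placeholders).
    query = "shenme shi jiqi xuexi"
    passage = "机器学习是人工智能的一个分支。"
    print(build_rerank_input(query, passage))

    # Placeholder logits; in practice these would come from the model's
    # first decoding step, read at the 'yes' and 'no' token positions.
    print(relevance_score(logit_yes=2.0, logit_no=-1.0))
```

In an actual pipeline, the formatted string would be tokenised and passed to the model, and candidate passages would be sorted by the resulting score in descending order.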