GLiNER-MoE-MultiLingual: A Zero-Shot Multilingual NER Model with MOE Architecture
This repository provides GLiNER-MoE-MultiLingual, a zero-shot Named Entity Recognition (NER) model trained for one epoch with the Mixture of Experts (MOE) architecture from NOMIC-MOE. It targets zero-shot multilingual NER across various domains and was inspired by my work documented in this Medium article.
Overview
Named Entity Recognition (NER) is a fundamental task in Natural Language Processing (NLP) where the goal is to detect and classify named entities in text into predefined categories such as persons, locations, organizations, and more.
GLiNER-MoE-MultiLingual is designed to:
- Perform NER in a zero-shot setting (i.e., it accepts arbitrary entity labels at inference time and can handle languages and domains it was not explicitly fine-tuned on).
- Leverage a Mixture of Experts (MOE) architecture for improved generalization across languages and domains.
- Serve as a single checkpoint for handling multiple languages, reducing the overhead of training separate models.
Features
- Zero-shot Multilingual Support: Handle NER for many languages without separate fine-tuning.
- Domain Agnostic: The model can generalize across diverse domains (news, biomedical, social media, etc.).
- Lightweight Training: Trained for only one epoch, demonstrating the efficiency of the MOE approach.
- Dataset: Multilingual samples were generated from Pile-NER samples using machine translation.
- Easy Integration: Built on top of standard NLP frameworks (e.g., Hugging Face Transformers) for quick integration into your pipeline.
Supported Languages
Here is the complete list of supported languages along with their ISO codes:
Code | Language | Code | Language | Code | Language | Code | Language |
---|---|---|---|---|---|---|---|
en | English | be | Belarusian | ml | Malayalam | mk | Macedonian |
es | Spanish | kn | Kannada | ur | Urdu | fy | Frisian |
fr | French | fil | Filipino | te | Telugu | eu | Basque |
de | German | sw | Swahili | so | Somali | sd | Sindhi |
it | Italian | uz | Uzbek | co | Corsican | hr | Croatian |
pt | Portuguese | gu | Gujarati | hi-Latn | Hindi (Latin) | ceb | Cebuano |
pl | Polish | eo | Esperanto | jv | Javanese | la | Latin |
nl | Dutch | zu | Zulu | mn | Mongolian | si | Sinhala |
tr | Turkish | el-Latn | Greek (Latin) | ga | Irish | ky | Kyrgyz |
ja | Japanese | tg | Tajik | my | Burmese | km | Khmer |
vi | Vietnamese | mg | Malagasy | pa | Punjabi | ru-Latn | Russian (Latin) |
ru | Russian | zh-Latn | Chinese (Latin) | ha | Hausa | he | Hebrew |
id | Indonesian | hm | Hmong | ht | Haitian | ja-Latn | Japanese (Latin) |
ar | Arabic | su | Sundanese | bg-Latn | Bulgarian (Latin) | gd | Scots Gaelic |
cs | Czech | ny | Nyanja | ps | Pashto | ku | Kurdish |
ro | Romanian | sh | Serbo-Croatian | am | Amharic | ig | Igbo |
sv | Swedish | lo | Lao | mi | Maori | nn | Norwegian Nynorsk |
el | Greek | sm | Samoan | st | Sotho | tl | Tagalog |
uk | Ukrainian | xh | Xhosa | yo | Yoruba | bn | Bengali |
zh | Chinese | ko | Korean | fa | Persian | ms | Malay |
hu | Hungarian | sl | Slovenian | lv | Latvian | mr | Marathi |
da | Danish | no | Norwegian | hi | Hindi | fi | Finnish |
lt | Lithuanian | ca | Catalan | cy | Welsh | bg | Bulgarian |
The table lists 92 language entries, including romanized (Latin-script) variants such as hi-Latn and ru-Latn, making GLiNER-MoE-MultiLingual a versatile zero-shot multilingual NER model. 🚀
Model Architecture
The Mixture of Experts (MOE) approach splits the model into several “experts,” each of which specializes in a subset of the input space. During inference, the MOE layer routes each token (or hidden state) to the most relevant expert(s). This helps in handling diverse languages and domains under a single unified model.
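The routing step described above can be illustrated with a toy top-k gate. This is a minimal sketch in plain Python, not the NOMIC-MOE implementation: the names `route_token` and `moe_layer`, the choice of `top_k=2`, the random router weights, and the scaling "experts" are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(hidden, router_weights, top_k=2):
    """Pick the top_k experts for one token and renormalize their gate weights."""
    # One router score per expert: dot product of the token with that expert's router row
    scores = [sum(h * w for h, w in zip(hidden, row)) for row in router_weights]
    probs = softmax(scores)
    # Keep only the top_k most probable experts
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(hidden, router_weights, experts, top_k=2):
    """Combine the outputs of the selected experts, weighted by their gates."""
    output = [0.0] * len(hidden)
    for expert_id, gate in route_token(hidden, router_weights, top_k):
        expert_out = experts[expert_id](hidden)
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output


# Toy setup: 4 experts over a 3-dimensional hidden state (all values are illustrative)
random.seed(0)
router_weights = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
experts = [lambda h, scale=s: [scale * x for x in h] for s in (0.5, 1.0, 1.5, 2.0)]

token = [0.2, -0.1, 0.7]
print(route_token(token, router_weights))  # two (expert_id, gate) pairs
print(moe_layer(token, router_weights, experts))
```

Only the selected experts run for a given token, which is why a sparse MOE layer can add capacity without a proportional increase in per-token compute.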
Performance
GLiNER-MoE-MultiLingual's zero-shot performance was evaluated on standard NER benchmarks across multiple domains, using a prediction threshold of 0.2:
Dataset | F1 Score |
---|---|
ACE 2004 | 26.2% |
ACE 2005 | 22.5% |
AnatEM | 31.9% |
Broad Tweet Corpus | 65.1% |
CoNLL 2003 | 61.5% |
FabNER | 22.4% |
FindVehicle | 10.6% |
GENIA_NER | 45.1% |
HarveyNER | 3.7% |
MultiNERD | 60.6% |
Ontonotes | 26.0% |
PolyglotNER | 43.1% |
TweetNER7 | 37.4% |
WikiANN en | 54.7% |
WikiNeural | 75.4% |
bc2gm | 54.8% |
bc4chemd | 45.0% |
bc5cdr | 68.2% |
ncbi | 62.9% |
Average | 43.0% |
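The Average row is consistent with an unweighted mean over the 19 datasets, with each dataset counting equally regardless of its size. The short check below copies the scores from the table; the variable names are illustrative.

```python
# F1 scores (%) copied from the table above
f1_scores = {
    "ACE 2004": 26.2, "ACE 2005": 22.5, "AnatEM": 31.9,
    "Broad Tweet Corpus": 65.1, "CoNLL 2003": 61.5, "FabNER": 22.4,
    "FindVehicle": 10.6, "GENIA_NER": 45.1, "HarveyNER": 3.7,
    "MultiNERD": 60.6, "Ontonotes": 26.0, "PolyglotNER": 43.1,
    "TweetNER7": 37.4, "WikiANN en": 54.7, "WikiNeural": 75.4,
    "bc2gm": 54.8, "bc4chemd": 45.0, "bc5cdr": 68.2, "ncbi": 62.9,
}

# Unweighted (macro) average across datasets
average = sum(f1_scores.values()) / len(f1_scores)
best = max(f1_scores, key=f1_scores.get)
worst = min(f1_scores, key=f1_scores.get)

print(f"{len(f1_scores)} datasets, average F1 = {average:.1f}%")  # 19 datasets, average F1 = 43.0%
print(f"best: {best} ({f1_scores[best]}%), worst: {worst} ({f1_scores[worst]}%)")
```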
Usage
Installation
Install this fork of the original GLiNER repository, which adds MOE support:
git clone https://github.com/mayank-rakesh-mck/GLiNER.git
cd GLiNER
pip install -r requirements.txt
Inference with Transformers Pipeline
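import json

import torch
# Assumes the cloned GLiNER folder is on the Python path (e.g., run from its parent directory)
from GLiNER.gliner import GLiNERConfig, GLiNER

# Build the model from its JSON config, then load the trained weights
with open('gliner_config.json') as f:
    config = json.load(f)
model_config = GLiNERConfig(**config)
model = GLiNER(model_config)

# Use the first GPU when available, otherwise fall back to CPU
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
state_dict = torch.load('pytorch_model.bin', map_location=device, weights_only=True)
model.model.load_state_dict(state_dict, strict=True)
model = model.to(device)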
# Language: English
# Sample text for entity prediction
text = """
Cristiano Ronaldo dos Santos Aveiro (Portuguese pronunciation: [kɾiʃˈtjɐnu ʁɔˈnaldu]; born 5 February 1985) is a Portuguese professional footballer who plays as a forward for and captains both Saudi Pro League club Al Nassr and the Portugal national team. Widely regarded as one of the greatest players of all time, Ronaldo has won five Ballon d'Or awards,[note 3] a record three UEFA Men's Player of the Year Awards, and four European Golden Shoes, the most by a European player. He has won 33 trophies in his career, including seven league titles, five UEFA Champions Leagues, the UEFA European Championship and the UEFA Nations League. Ronaldo holds the records for most appearances (183), goals (140) and assists (42) in the Champions League, goals in the European Championship (14), international goals (128) and international appearances (205). He is one of the few players to have made over 1,200 professional career appearances, the most by an outfield player, and has scored over 850 official senior career goals for club and country, making him the top goalscorer of all time.
"""
# Labels for entity prediction
# Most GLiNER models should work best when entity types are in lower case or title case
labels = ["Person", "Award", "Date", "Competitions", "Teams"]
# Perform entity prediction
entities = model.predict_entities(text, labels, threshold=0.2)
# Display predicted entities and their labels
for entity in entities:
    print(entity["text"], "=>", entity["label"])
Output:
Cristiano Ronaldo => Person
5 February 1985 => Date
Al Nassr => Teams
Portugal national team => Teams
Ballon d'Or => Award
UEFA Men's Player of the Year Awards => Award
European Golden Shoes => Competitions
UEFA Champions Leagues => Competitions
UEFA European Championship => Competitions
UEFA Nations League => Competitions
Champions League => Competitions
European Championship => Competitions
international appearances => Award
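A flat list of predictions is often easier to use once it is grouped by label. The sketch below relies only on the `"text"` and `"label"` keys used in the loop above; the helper name `group_by_label` is hypothetical, and the sample list is a hand-copied subset of the English output rather than a fresh model run.

```python
from collections import defaultdict

# A few predictions copied from the English output above, in the
# list-of-dicts shape that the loop over `entities` relies on
entities = [
    {"text": "Cristiano Ronaldo", "label": "Person"},
    {"text": "5 February 1985", "label": "Date"},
    {"text": "Al Nassr", "label": "Teams"},
    {"text": "Portugal national team", "label": "Teams"},
    {"text": "Ballon d'Or", "label": "Award"},
    {"text": "UEFA Champions Leagues", "label": "Competitions"},
    {"text": "Champions League", "label": "Competitions"},
]


def group_by_label(entities):
    """Map each label to its distinct entity strings, in first-seen order."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["text"] not in grouped[entity["label"]]:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


for label, texts in group_by_label(entities).items():
    print(f"{label}: {', '.join(texts)}")
```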
Examples
Example usage in a Jupyter notebook cell:
# Language: Armenian
text = """
Կրիշտիանու Ռոնալդու դոս Սանտոս Ավեյրո (պորտուգալերեն արտասանություն՝ [kɾiʃˈtjɐnu ʁɔˈnaldu], ծնված 1985 թվականի փետրվարի 5-ին) պորտուգալացի պրոֆեսիոնալ ֆուտբոլիստ է, ով խաղում է Սաուդյան Արաբիայի Պրոֆեսիոնալ լիգայի և Պորտուգալիայի ազգային հավաքականի հարձակվող և ավագը։ Լայնորեն համարվում է բոլոր ժամանակների լավագույն խաղացողներից մեկը՝ Ռոնալդուն արժանացել է «Ոսկե գնդակի» հինգ մրցանակների, [նշում 3]՝ ռեկորդային երեք՝ ՈւԵՖԱ-ի տարվա լավագույն խաղացողի մրցանակի և Եվրոպայի չորս «Ոսկե խաղակոշիկի»՝ ամենաշատը եվրոպացի խաղացողների կողմից: Նա իր կարիերայի ընթացքում նվաճել է 33 գավաթ, այդ թվում՝ յոթ լիգայի տիտղոս, ՈՒԵՖԱ-ի հինգ Չեմպիոնների լիգա, ՈՒԵՖԱ-ի Եվրոպայի առաջնություն և ՈՒԵՖԱ-ի Ազգերի լիգա: Ռոնալդուն ռեկորդներ ունի Չեմպիոնների լիգայում ամենաշատ խաղերի (183), գոլերի (140) և գոլային փոխանցման (42), Եվրոպայի առաջնությունում (14), միջազգային գոլերի (128) և միջազգային խաղերի (205) ռեկորդների քանակով: Նա այն սակավաթիվ խաղացողներից է, ով անցկացրել է ավելի քան 1200 պրոֆեսիոնալ կարիերա, որոնցից ամենաշատը խաղադաշտ դուրս է եկել, և ավելի քան 850 գոլ է խփել ակումբի և երկրի գլխավոր կարիերայի ընթացքում՝ դարձնելով նրան բոլոր ժամանակների լավագույն ռմբարկուն:"""
# Labels for entity prediction
# Most GLiNER models should work best when entity types are in lower case or title case
labels = ["Person", "Award", "Date", "Competitions", "Teams"]
# Perform entity prediction
entities = model.predict_entities(text, labels, threshold=0.2)
# Display predicted entities and their labels
for entity in entities:
    print(entity["text"], "=>", entity["label"])
Output:
Կրիշտիանու Ռոնալդու դոս Սանտոս Ավեյրո => Person
1985 => Date
փետրվարի 5-ին => Date
Սաուդյան Արաբիայի Պրոֆեսիոնալ լիգայի => Teams
Պորտուգալիայի ազգային հավաքականի => Teams
Ոսկե գնդակի => Award
Ոսկե խաղակոշիկի => Award
յոթ լիգայի տիտղոս => Award
ՈՒԵՖԱ-ի հինգ Չեմպիոնների լիգա => Competitions
ՈՒԵՖԱ-ի Ազգերի լիգա => Competitions
Ռոնալդուն => Person
Չեմպիոնների լիգայում => Competitions
Եվրոպայի առաջնությունում => Competitions
միջազգային խաղերի => Competitions
# Language: Spanish
text = """
Cristiano Ronaldo dos Santos Aveiro (pronunciación portuguesa: [kɾiʃˈtjɐnu ʁɔˈnaldu]; nacido el 5 de febrero de 1985) es un futbolista profesional portugués que juega como delantero y capitán tanto del club Al Nassr de la Saudi Pro League como de la selección nacional de Portugal. Ampliamente considerado como uno de los mejores jugadores de todos los tiempos, Ronaldo ha ganado cinco premios Balón de Oro, un récord de tres premios al Jugador del Año de la UEFA y cuatro Botas de Oro europeas, la mayor cantidad para un jugador europeo. Ha ganado 33 trofeos en su carrera, incluidos siete títulos de liga, cinco Ligas de Campeones de la UEFA, el Campeonato de Europa de la UEFA y la Liga de Naciones de la UEFA. Ronaldo tiene los récords de más apariciones (183), goles (140) y asistencias (42) en la Liga de Campeones, goles en la Eurocopa (14), goles internacionales (128) y apariciones internacionales (205). Es uno de los pocos jugadores que ha disputado más de 1.200 apariciones en su carrera profesional, la mayor cantidad para un jugador de campo, y ha marcado más de 850 goles oficiales en su carrera absoluta para su club y su país, lo que lo convierte en el máximo goleador de todos los tiempos.
"""
# Labels for entity prediction
# Most GLiNER models should work best when entity types are in lower case or title case
labels = ["Person", "Award", "Date", "Competitions", "Teams"]
# Perform entity prediction
entities = model.predict_entities(text, labels, threshold=0.2)
# Display predicted entities and their labels
for entity in entities:
    print(entity["text"], "=>", entity["label"])
Output:
Cristiano Ronaldo => Person
5 de febrero de 1985 => Date
Al Nassr => Teams
Saudi Pro League => Teams
Balón de Oro => Award
Jugador del Año => Award
Botas de Oro => Award
títulos de liga => Competitions
Ligas de Campeones => Competitions
Campeonato de Europa => Competitions
Liga de Naciones => Competitions
Liga de Campeones => Competitions
Eurocopa => Competitions
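The examples above repeat the same `predict_entities` call for each language, so a small helper can run one label set over several texts. This is a sketch under stated assumptions: `predict_by_language` is a hypothetical helper, `EchoModel` is a stand-in that lets the code run without the model weights, and the two short sentences are illustrative inputs. With the real model, pass the loaded `model` object instead of `EchoModel()`.

```python
def predict_by_language(model, texts, labels, threshold=0.2):
    """Run the same label set over several texts and collect (text, label) pairs."""
    results = {}
    for language, text in texts.items():
        entities = model.predict_entities(text, labels, threshold=threshold)
        results[language] = [(e["text"], e["label"]) for e in entities]
    return results


class EchoModel:
    """Hypothetical stand-in for the loaded model so the sketch runs without weights."""

    def predict_entities(self, text, labels, threshold=0.5):
        # Tags the first word of the text with the first label; purely illustrative
        return [{"text": text.split()[0], "label": labels[0]}]


texts = {
    "English": "Cristiano Ronaldo plays for Al Nassr.",
    "Spanish": "Cristiano Ronaldo juega en el Al Nassr.",
}
labels = ["Person", "Award", "Date", "Competitions", "Teams"]

for language, pairs in predict_by_language(EchoModel(), texts, labels).items():
    print(language, pairs)
```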
Citation
@misc {mayank_rakesh_2025,
author = { {Mayank Rakesh} },
title = { GLiNER-MoE-MultiLingual (Revision 3ba1ed0) },
year = 2025,
url = { https://huggingface.co/Mayank6255/GLiNER-MoE-MultiLingual },
doi = { 10.57967/hf/4502 },
publisher = { Hugging Face }
}
References
@inproceedings{zaratiana-etal-2024-gliner,
title = "{GL}i{NER}: Generalist Model for Named Entity Recognition using Bidirectional Transformer",
author = "Zaratiana, Urchade and
Tomeh, Nadi and
Holat, Pierre and
Charnois, Thierry",
editor = "Duh, Kevin and
Gomez, Helena and
Bethard, Steven",
booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
month = jun,
year = "2024",
address = "Mexico City, Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.naacl-long.300",
doi = "10.18653/v1/2024.naacl-long.300",
pages = "5364--5376",
abstract = "Named Entity Recognition (NER) is essential in various Natural Language Processing (NLP) applications. Traditional NER models are effective but limited to a set of predefined entity types. In contrast, Large Language Models (LLMs) can extract arbitrary entities through natural language instructions, offering greater flexibility. However, their size and cost, particularly for those accessed via APIs like ChatGPT, make them impractical in resource-limited scenarios. In this paper, we introduce a compact NER model trained to identify any type of entity. Leveraging a bidirectional transformer encoder, our model, GLiNER, facilitates parallel entity extraction, an advantage over the slow sequential token generation of LLMs. Through comprehensive testing, GLiNER demonstrate strong performance, outperforming both ChatGPT and fine-tuned LLMs in zero-shot evaluations on various NER benchmarks.",
}
@misc{nussbaum2025trainingsparsemixtureexperts,
title={Training Sparse Mixture Of Experts Text Embedding Models},
author={Zach Nussbaum and Brandon Duderstadt},
year={2025},
eprint={2502.07972},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.07972},
}