DAL-BERT: Another pre-trained language model for Persian

DAL-BERT is a transformer-based model trained on more than 80 gigabytes of Persian text including both formal and informal (conversational) contexts. The architecture of this model follows the original BERT [Devlin et al.].

How to use the Model

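from transformers import BertForMaskedLM, BertTokenizer, pipeline
# Load the pre-trained weights and the matching tokenizer from the Hugging Face Hub
model = BertForMaskedLM.from_pretrained('sharif-dal/dal-bert')
tokenizer = BertTokenizer.from_pretrained('sharif-dal/dal-bert')
# Build a fill-mask pipeline that predicts the token hidden behind [MASK]
fill_sentence = pipeline('fill-mask', model=model, tokenizer=tokenizer)
# Placeholder input (Persian for: "Write your sentence here and replace the desired word with [MASK]")
fill_sentence('اینجا جمله مورد نظر خود را بنویسید و کلمه موردنظر را [MASK] کنید')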
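
The fill-mask pipeline returns a list of candidate dictionaries, each with `score`, `token_str`, and `sequence` keys. The following is a minimal sketch of how one might rank and shorten that output. The helper names `top_predictions` and `run_demo` are hypothetical and not part of the model card, and the sketch assumes the input sentence contains exactly one [MASK] token.

```python
from typing import Dict, List, Tuple


def top_predictions(results: List[Dict], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (token, score) pairs from fill-mask output.

    Hypothetical helper: `results` is expected to be the list of dicts that a
    fill-mask pipeline returns for a sentence with a single [MASK].
    """
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(r["token_str"], round(r["score"], 4)) for r in ranked[:k]]


def run_demo(sentence: str, k: int = 3) -> List[Tuple[str, float]]:
    """Load DAL-BERT and rank the candidates for the [MASK] in `sentence`."""
    # Imported lazily so top_predictions works without transformers installed.
    from transformers import BertForMaskedLM, BertTokenizer, pipeline

    model = BertForMaskedLM.from_pretrained("sharif-dal/dal-bert")
    tokenizer = BertTokenizer.from_pretrained("sharif-dal/dal-bert")
    fill_sentence = pipeline("fill-mask", model=model, tokenizer=tokenizer)
    # top_k limits how many candidates the pipeline returns for the mask.
    return top_predictions(fill_sentence(sentence, top_k=k), k)
```

Calling `run_demo` with a Persian sentence that contains one [MASK] downloads the model on first use and returns up to `k` (token, score) pairs, highest score first.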

The Training Data

DAL-BERT was trained on a mix of newspapers, news agency websites, technology-related sources, user comments, magazines, literary criticism, and blogs.

Evaluation

Training Loss	Epoch	Step
2.1855	13	7649486
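
To put the reported numbers in context, the sketch below converts the training loss to a perplexity and derives the number of steps per epoch. It rests on two assumptions the model card does not confirm: that the loss is a mean masked-language-modeling cross-entropy measured in nats, and that the Step column is the cumulative step count at the end of epoch 13. Because the loss is a training loss, the result is a training perplexity, not a held-out score.

```python
import math

# Values copied from the evaluation table.
TRAINING_LOSS = 2.1855
EPOCHS = 13
TOTAL_STEPS = 7_649_486

# Assumption: the loss is mean cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(TRAINING_LOSS)

# Assumption: TOTAL_STEPS is the cumulative step count after EPOCHS epochs.
steps_per_epoch = TOTAL_STEPS / EPOCHS

print(f"training perplexity: {perplexity:.2f}")   # about 8.90
print(f"steps per epoch: {steps_per_epoch:,.0f}")  # 588,422
```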

Contributors

Arman Malekzadeh [Github]
Amirhossein Ramazani, Master's Student in AI @ Sharif University of Technology [Linkedin] [Github]


Reference

Devlin et al., "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", arXiv:1810.04805, 2018.