
SINA-BERT: A Pre-trained Language Model for Analysis of Medical Texts in Persian

SINA-BERT is the first Persian medical language model, based on BERT (Devlin et al., 2018). It is pre-trained on a large-scale corpus of medical content, including formal and informal texts collected from a variety of online resources, in order to improve performance on health-care related tasks.

Model Evaluation

SINA-BERT can be used as a representation model for any Persian medical text task. In our paper we have examined the following:

  1. categorization of medical questions,
  2. medical sentiment analysis,
  3. medical question retrieval.

For each task, we have developed Persian annotated datasets and learned a representation for the data of each task, particularly for complex and long medical questions. With the same architecture used across tasks, SINA-BERT outperforms BERT-based models previously made available for Persian.
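As an illustration of how such tasks can be set up, the sketch below fine-tunes SINA-BERT for medical-question categorization with the standard transformers sequence-classification head. This is a minimal sketch under our own assumptions, not the exact training recipe of the paper; the label count and the sample question are placeholders.

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical setup: num_labels is a placeholder, not the paper's category count.
tokenizer = AutoTokenizer.from_pretrained("hooshafzar/SINA-BERT")
model = AutoModelForSequenceClassification.from_pretrained(
    "hooshafzar/SINA-BERT", num_labels=5
)

# Tokenize a (placeholder) Persian medical question and score it.
inputs = tokenizer("یک سوال پزشکی نمونه", truncation=True, max_length=512,
                   return_tensors="pt")
logits = model(**inputs).logits  # shape: (1, num_labels)

The same encoder, with different heads or a fine-tuned embedding space, covers the sentiment-analysis and question-retrieval tasks.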

To read about the datasets and results, please refer to the SINA-BERT paper: arXiv:2104.07613v1

  • Developed by: HooshAfzar Salamat Team
  • Language(s) (NLP): Persian
  • Finetuned from model: ParsBERT


How to use

from transformers import AutoConfig, AutoTokenizer, AutoModel

# Load the SINA-BERT configuration, tokenizer, and encoder from the Hugging Face Hub.
config = AutoConfig.from_pretrained("hooshafzar/SINA-BERT")
tokenizer = AutoTokenizer.from_pretrained("hooshafzar/SINA-BERT")
model = AutoModel.from_pretrained("hooshafzar/SINA-BERT")
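
Because SINA-BERT is a BERT-style encoder, a common use is extracting sentence embeddings, e.g. for medical question retrieval. The snippet below continues from the code above; mean pooling over the last hidden states is our illustrative choice, not necessarily the retrieval method used in the paper.

import torch

# Encode a (placeholder) Persian medical question and mean-pool the token
# embeddings into a single sentence vector, ignoring padding positions.
inputs = tokenizer("سردرد شدید دارم، چه کار کنم؟", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
mask = inputs["attention_mask"].unsqueeze(-1)  # (1, seq_len, 1)
embedding = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)
print(embedding.shape)  # torch.Size([1, 768]) for a BERT-base encoder

Cosine similarity between such vectors can then rank candidate questions against a query.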

Citation

@article{taghizadeh2021sina,
  title={SINA-BERT: a pre-trained language model for analysis of medical texts in Persian},
  author={Taghizadeh, Nasrin and Doostmohammadi, Ehsan and Seifossadat, Elham and Rabiee, Hamid R and Tahaei, Maedeh S},
  journal={arXiv preprint arXiv:2104.07613},
  year={2021}
}