Turkish Named Entity Recognition (NER) Model

This model is the fine-tuned model of "dbmdz/bert-base-turkish-cased" using a reviewed version of well known Turkish NER dataset (https://github.com/stefan-it/turkish-bert/files/4558187/nerdata.txt).

Fine-tuning parameters:

task = "ner"
model_checkpoint = "dbmdz/bert-base-turkish-cased"
batch_size = 8 
label_list = ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC']
max_length = 512 
learning_rate = 2e-5 
num_train_epochs = 3 
weight_decay = 0.01 

How to use:

model = AutoModelForTokenClassification.from_pretrained("akdeniz27/bert-base-turkish-cased-ner")
tokenizer = AutoTokenizer.from_pretrained("akdeniz27/bert-base-turkish-cased-ner")
ner = pipeline('ner', model=model, tokenizer=tokenizer, aggregation_strategy="first")
ner("your text here")

Pls refer "https://huggingface.co/transformers/_modules/transformers/pipelines/token_classification.html" for entity grouping with aggregation_strategy parameter.

Reference test results:

  • accuracy: 0.9933935699477056
  • f1: 0.9592969472710453
  • precision: 0.9543530277931161
  • recall: 0.9642923563325274

Evaluation results with the test sets proposed in "Küçük, D., Küçük, D., Arıcı, N. 2016. Türkçe Varlık İsmi Tanıma için bir Veri Kümesi ("A Named Entity Recognition Dataset for Turkish"). IEEE Sinyal İşleme, İletişim ve Uygulamaları Kurultayı. Zonguldak, Türkiye." paper.

  • Test Set Acc. Prec. Rec. F1-Score
  • 20010000 0.9946 0.9871 0.9463 0.9662
  • 20020000 0.9928 0.9134 0.9206 0.9170
  • 20030000 0.9942 0.9814 0.9186 0.9489
  • 20040000 0.9943 0.9660 0.9522 0.9590
  • 20050000 0.9971 0.9539 0.9932 0.9732
  • 20060000 0.9993 0.9942 0.9942 0.9942
  • 20070000 0.9970 0.9806 0.9439 0.9619
  • 20080000 0.9988 0.9821 0.9649 0.9735
  • 20090000 0.9977 0.9891 0.9479 0.9681
  • 20100000 0.9961 0.9684 0.9293 0.9485
  • Overall 0.9961 0.9720 0.9516 0.9617
Downloads last month
73,101
Safetensors
Model size
110M params
Tensor type
I64
·
F32
·
Inference Providers NEW
This model is not currently available via any of the supported third-party Inference Providers, and the model is not deployed on the HF Inference API.

Model tree for akdeniz27/bert-base-turkish-cased-ner

Finetunes
3 models

Spaces using akdeniz27/bert-base-turkish-cased-ner 4