Note: New Improved Model Available (F1 Score: 89.04%)

For better performance and an improved F1 score, please use the updated model here: https://huggingface.co/Ishan0612/biobert-ner-disease-ncbi

BioBERT Disease NER Model

A medical NER model built by fine-tuning BioBERT on the NCBI Disease dataset. It achieves 98.79% accuracy and an F1-score of 86.98%, giving reliable performance on disease extraction tasks that require identifying diseases and symptoms in medical texts.

Model Performance

  • Precision: 85.69%
  • Recall: 88.31%
  • F1-Score: 86.98%
  • Accuracy: 98.79%

✅ Fine-tuned on 6,800+ annotated examples for 5 epochs, achieving consistently high validation scores.
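
As a quick sanity check, the F1-score listed above is the harmonic mean of the reported precision and recall. The sketch below only redoes that arithmetic from the two published numbers; it does not re-evaluate the model, and the variable names are illustrative.

```python
# Reported metrics from the list above, expressed as fractions
precision = 0.8569
recall = 0.8831

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 from reported precision and recall: {f1:.2%}")  # -> 86.98%
```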

Intended Use

This model is designed for:

  • Extracting disease mentions from clinical and biomedical texts.
  • Powering information retrieval, research automation, and medical chatbots.

Training Data

This model was trained on the NCBI Disease dataset, which consists of 793 PubMed abstracts with 6,892 disease mentions.

How to Use

Note: LABEL_0 corresponds to "O" (Outside), LABEL_1 to "B-Disease", and LABEL_2 to "I-Disease", following the BIO tagging format.

You can use this model with the Hugging Face Transformers library:

from transformers import pipeline

# Load from Hugging Face
nlp = pipeline("ner", model="Ishan0612/biobert_medical_ner", tokenizer="Ishan0612/biobert_medical_ner", aggregation_strategy="simple")

# Sample medical text
text = """Robert suffering from chest pain and thiroid."""

# Extract entities
ner_results = nlp(text)

# Display results
print("Extracted Medical Entities:")
for entity in ner_results:
    print(f"{entity['word']} ({entity['entity_group']}) - Confidence: {entity['score']:.2f}")

This should output:

Extracted Medical Entities:
Robert suffering from (LABEL_0) - Confidence: 1.00
chest (LABEL_1) - Confidence: 1.00
pain (LABEL_2) - Confidence: 1.00
and (LABEL_0) - Confidence: 1.00
th (LABEL_1) - Confidence: 1.00
##iroid (LABEL_2) - Confidence: 0.97
. (LABEL_0) - Confidence: 1.00
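
Because the pipeline reports generic LABEL_n names and splits each mention into B-Disease and I-Disease pieces, a small post-processing step can help turn the results into whole disease mentions. The following is a minimal sketch under stated assumptions, not part of the model or the Transformers library: `LABEL_MAP` and `merge_disease_spans` are hypothetical helper names, and `sample` is hand-written to mirror the output shown above so that the snippet runs without downloading the model.

```python
# Label mapping as stated in the note: generic label name -> BIO tag
LABEL_MAP = {"LABEL_0": "O", "LABEL_1": "B-Disease", "LABEL_2": "I-Disease"}


def merge_disease_spans(entities):
    """Join B-Disease / I-Disease pieces into whole mention strings.

    Hypothetical helper: expects dicts with "entity_group" and "word" keys,
    as in the pipeline results printed above.
    """
    mentions = []
    current = None  # text of the mention being built, if any
    for ent in entities:
        tag = LABEL_MAP.get(ent["entity_group"], "O")
        word = ent["word"]
        if tag == "B-Disease":
            # A new mention starts; close any open one first
            if current is not None:
                mentions.append(current)
            current = word
        elif tag == "I-Disease" and current is not None:
            # WordPiece continuations start with "##" and attach without a space
            current += word[2:] if word.startswith("##") else " " + word
        else:
            # "O" tokens (and I-Disease with no open mention) end the current span
            if current is not None:
                mentions.append(current)
                current = None
    if current is not None:
        mentions.append(current)
    return mentions


# Hand-written stand-in for ner_results, copied from the output shown above
sample = [
    {"entity_group": "LABEL_0", "word": "Robert suffering from", "score": 1.00},
    {"entity_group": "LABEL_1", "word": "chest", "score": 1.00},
    {"entity_group": "LABEL_2", "word": "pain", "score": 1.00},
    {"entity_group": "LABEL_0", "word": "and", "score": 1.00},
    {"entity_group": "LABEL_1", "word": "th", "score": 1.00},
    {"entity_group": "LABEL_2", "word": "##iroid", "score": 0.97},
    {"entity_group": "LABEL_0", "word": ".", "score": 1.00},
]

print(merge_disease_spans(sample))  # -> ['chest pain', 'thiroid']
```

In real use, `ner_results` from the pipeline call would be passed in place of `sample`. The misspelled "thiroid" is carried through unchanged because the helper only reassembles the pieces the model tagged.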

Model size: 108M params (Safetensors); tensor type: F32