Dataset Used

This model was fine-tuned on the CoNLL-2003 dataset for Named Entity Recognition (NER).

The dataset includes the following labels:

  • O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC

For detailed descriptions of these labels, please refer to the dataset card.
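The labels follow the BIO scheme: `B-` marks the first token of an entity, `I-` marks a continuation token, and `O` marks tokens outside any entity. A minimal illustration (the example sentence is the widely quoted opening sentence of the CoNLL-2003 training set; the tagging shown here is for illustration):

```python
# The nine CoNLL-2003 NER labels in BIO format.
label_names = [
    "O", "B-PER", "I-PER", "B-ORG", "I-ORG",
    "B-LOC", "I-LOC", "B-MISC", "I-MISC",
]

# Illustrative BIO tagging of a tokenized sentence.
tokens = ["EU", "rejects", "German", "call", "to", "boycott", "British", "lamb"]
tags   = ["B-ORG", "O", "B-MISC", "O", "O", "O", "B-MISC", "O"]

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```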

Model Training Details

Training Arguments

  • Model Architecture: bert-base-cased for token classification
  • Learning Rate: 2e-5
  • Number of Epochs: 20
  • Weight Decay: 0.01
  • Evaluation Strategy: epoch
  • Save Strategy: epoch

All other parameters were left at their Hugging Face Transformers defaults.
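Under these settings, the training configuration might look like the following sketch. The `output_dir` name and the use of `TrainingArguments` are assumptions based on the standard Transformers fine-tuning recipe, not the author's exact script:

```python
from transformers import AutoModelForTokenClassification, TrainingArguments

# Hyperparameters listed above; everything else stays at library defaults.
args = TrainingArguments(
    output_dir="bert-finetuned-ner",  # assumed name, not confirmed by the card
    learning_rate=2e-5,
    num_train_epochs=20,
    weight_decay=0.01,
    evaluation_strategy="epoch",
    save_strategy="epoch",
)

model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=9,  # O plus B-/I- variants for PER, ORG, LOC, MISC
)
```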

Evaluation Results

Validation Set Performance

  • Overall Metrics:
    • Precision: 94.44%
    • Recall: 95.74%
    • F1 Score: 95.09%
    • Accuracy: 98.73%
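As a quick sanity check, the reported F1 score is consistent with the harmonic mean of the reported precision and recall:

```python
# F1 is the harmonic mean of precision and recall.
precision = 0.9444
recall = 0.9574
f1 = 2 * precision * recall / (precision + recall)
print(f"{f1:.2%}")  # matches the reported 95.09%
```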

Per-Label Performance

  Entity Type   Precision   Recall    F1 Score
  LOC           97.27%      97.11%    97.19%
  MISC          87.46%      91.54%    89.45%
  ORG           93.37%      93.44%    93.40%
  PER           96.02%      98.15%    97.07%

Test Set Performance

  • Overall Metrics:
    • Precision: 89.90%
    • Recall: 91.91%
    • F1 Score: 90.89%
    • Accuracy: 97.27%

Per-Label Performance

  Entity Type   Precision   Recall    F1 Score
  LOC           92.87%      92.87%    92.87%
  MISC          75.55%      82.76%    78.99%
  ORG           88.32%      90.61%    89.45%
  PER           95.28%      96.23%    95.75%

How to Use the Model

You can load the model directly from the Hugging Face Model Hub:

from transformers import pipeline

# Replace with your specific model checkpoint
model_checkpoint = "Prikshit7766/bert-finetuned-ner"
token_classifier = pipeline(
    "token-classification", 
    model=model_checkpoint, 
    aggregation_strategy="simple"
)

# Example usage
result = token_classifier("My name is Sylvain and I work at Hugging Face in Brooklyn.")
print(result)

Example Output

[
   {
      "entity_group":"PER",
      "score":0.9999881,
      "word":"Sylvain",
      "start":11,
      "end":18
   },
   {
      "entity_group":"ORG",
      "score":0.99961376,
      "word":"Hugging Face",
      "start":33,
      "end":45
   },
   {
      "entity_group":"LOC",
      "score":0.99989843,
      "word":"Brooklyn",
      "start":49,
      "end":57
   }
]
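The `aggregation_strategy="simple"` option merges consecutive tokens belonging to the same entity into a single span, which is why the output above shows "Hugging Face" as one `ORG` entry. The grouping logic can be sketched roughly like this (a simplified illustration, not the library's actual implementation, which also averages scores and tracks character offsets):

```python
def group_entities(tokens, tags):
    """Merge consecutive BIO-tagged tokens into (entity, text) spans."""
    groups = []
    for token, tag in zip(tokens, tags):
        if tag == "O":
            continue
        prefix, entity = tag.split("-", 1)
        # Continue the previous span only for I- tags of the same entity type.
        if prefix == "I" and groups and groups[-1][0] == entity:
            groups[-1][1].append(token)
        else:
            groups.append([entity, [token]])
    return [(entity, " ".join(words)) for entity, words in groups]

tokens = ["My", "name", "is", "Sylvain", "and", "I", "work", "at", "Hugging", "Face"]
tags   = ["O", "O", "O", "B-PER", "O", "O", "O", "O", "B-ORG", "I-ORG"]
print(group_entities(tokens, tags))
# [('PER', 'Sylvain'), ('ORG', 'Hugging Face')]
```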