Model Card: BERT for Named Entity Recognition (NER)
Model Overview
This model, bert-conll-ner, is a fine-tuned version of bert-base-uncased
trained for the task of Named Entity Recognition (NER) using the CoNLL-2003 dataset. It is designed to identify and classify entities in text, such as person names (PER), organizations (ORG), locations (LOC), and miscellaneous (MISC) entities.
Model Architecture
- Base Model: BERT (Bidirectional Encoder Representations from Transformers) with the bert-base-uncased architecture.
- Task: Token Classification (NER).
Training Dataset
- Dataset: CoNLL-2003, a standard dataset for NER tasks containing sentences annotated with named entity spans.
- Classes:
  - PER (Person)
  - ORG (Organization)
  - LOC (Location)
  - MISC (Miscellaneous)
  - O (Outside of any entity span)
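At the token level the tag set uses the BIO scheme (B-PER, I-PER, and so on). One quick way to inspect the exact labels and their index order is to read them from the checkpoint config. This assumes the repo id used elsewhere on this card, and the mapping shown in the comment is only illustrative:

from transformers import AutoConfig

# The fine-tuned checkpoint stores its label mapping in the config; the id order is checkpoint-specific.
config = AutoConfig.from_pretrained("sfarrukh/bert-conll-ner")
print(config.id2label)
# e.g. {0: 'O', 1: 'B-PER', 2: 'I-PER', 3: 'B-ORG', ...}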
Performance Metrics
The model achieves strong results on the CoNLL-2003 evaluation set:
Metric | Value |
---|---|
Loss | 0.0649 |
Precision | 93.59% |
Recall | 95.07% |
F1 Score | 94.32% |
Accuracy | 98.79% |
These metrics indicate the model's high accuracy and robustness in identifying and classifying entities.
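Precision, recall, and F1 for CoNLL-style NER are conventionally computed at the entity (span) level rather than per token, while accuracy is typically a token-level figure. As an illustration of the span-level convention, here is a toy computation with the seqeval library; seqeval is not referenced by this card, and the tag sequences below are made up:

from seqeval.metrics import precision_score, recall_score, f1_score

# Toy example: a prediction that truncates the second entity span counts as a full miss for that span.
y_true = [["B-PER", "O", "O", "B-LOC", "I-LOC", "I-LOC", "O"]]
y_pred = [["B-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]]

print(precision_score(y_true, y_pred))  # 0.5  (1 of 2 predicted spans matches exactly)
print(recall_score(y_true, y_pred))     # 0.5  (1 of 2 gold spans is recovered)
print(f1_score(y_true, y_pred))         # 0.5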
Training Details
- Optimizer: AdamW (Adam with weight decay)
- Learning Rate: 2e-5
- Batch Size: 8
- Number of Epochs: 3
- Scheduler: Linear scheduler with warm-up steps
- Loss Function: Cross-entropy loss with ignored index (-100) for padding tokens; see the training sketch below.
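The hyperparameters above can be plugged into the standard Hugging Face Trainer recipe for token classification. The sketch below is a reconstruction under assumptions, not the author's original script: the warm-up size and weight decay are not stated on this card (warmup_ratio=0.1 and weight_decay=0.01 are placeholders), and labeling only the first sub-token of each word (with -100 elsewhere) is a common convention rather than a documented choice.

from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForTokenClassification,
                          DataCollatorForTokenClassification, TrainingArguments, Trainer)

raw = load_dataset("conll2003")
label_list = raw["train"].features["ner_tags"].feature.names  # ['O', 'B-PER', 'I-PER', ...]
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize_and_align(batch):
    # BERT splits words into sub-tokens; labels must be re-aligned to the sub-token sequence.
    enc = tokenizer(batch["tokens"], truncation=True, is_split_into_words=True)
    all_labels = []
    for i, word_labels in enumerate(batch["ner_tags"]):
        previous_word = None
        labels = []
        for word_id in enc.word_ids(batch_index=i):
            if word_id is None:
                labels.append(-100)                  # [CLS]/[SEP]/padding: ignored by the loss
            elif word_id != previous_word:
                labels.append(word_labels[word_id])  # first sub-token carries the word's label
            else:
                labels.append(-100)                  # assumption: continuation sub-tokens are ignored
            previous_word = word_id
        all_labels.append(labels)
    enc["labels"] = all_labels
    return enc

dataset = raw.map(tokenize_and_align, batched=True)
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(label_list),
    id2label=dict(enumerate(label_list)),
    label2id={label: i for i, label in enumerate(label_list)},
)

args = TrainingArguments(
    output_dir="bert-conll-ner",
    learning_rate=2e-5,                 # values listed on this card
    per_device_train_batch_size=8,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,                   # assumed; the card does not give the warm-up size
    weight_decay=0.01,                  # assumed AdamW weight decay
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()

AdamW and the linear warm-up scheduler listed above are the Trainer defaults, so setting them explicitly is only for clarity.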
Model Input/Output
- Input Format: Tokenized text with special tokens [CLS] and [SEP].
- Output Format: Token-level predictions with corresponding labels from the NER tag set (B-PER, I-PER, etc.), as sketched below.
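A minimal sketch of this interface using the raw model rather than the pipeline shown further down; the repo id is taken from this card, and the tokens and tags in the comments are illustrative:

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("sfarrukh/bert-conll-ner")
model = AutoModelForTokenClassification.from_pretrained("sfarrukh/bert-conll-ner")

enc = tokenizer("John lives in New York City.", return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(enc["input_ids"][0]))   # ['[CLS]', 'john', 'lives', ..., '[SEP]']

with torch.no_grad():
    logits = model(**enc).logits                               # shape: (1, sequence_length, num_labels)

predictions = logits.argmax(dim=-1)[0].tolist()
print([model.config.id2label[p] for p in predictions])        # one tag per token, e.g. 'B-PER', 'O', ...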
How to Use the Model
Installation
pip install transformers torch
Loading the Model
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("sfarrukh/bert-conll-ner")
model = AutoModelForTokenClassification.from_pretrained("sfarrukh/bert-conll-ner")
Running Inference
from transformers import pipeline

# aggregation_strategy="simple" merges sub-token predictions into whole-entity spans
nlp = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

text = "John lives in New York City."
result = nlp(text)
print(result)
Example output:

[{'entity_group': 'PER',
  'score': 0.99912304,
  'word': 'john',
  'start': 0,
  'end': 4},
 {'entity_group': 'LOC',
  'score': 0.9993351,
  'word': 'new york city',
  'start': 14,
  'end': 27}]
Limitations
- Domain-Specific Adaptability: Performance might drop on domain-specific texts (e.g., legal or medical) not covered by the CoNLL-2003 dataset.
- Ambiguity: Ambiguous entities or overlapping spans are not explicitly handled.
Recommendations
- For domain-specific tasks, consider fine-tuning this model further on a relevant dataset.
- Use a pre-processing pipeline to handle long texts by splitting them into smaller segments, as sketched below.
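A minimal sketch of such a splitting step, assuming the nlp pipeline object from the usage example above. The fixed character-based chunk size is a simplification; in practice splitting on sentence boundaries is safer, since a hard cut can break an entity across chunks:

def ner_long_text(text, nlp, chunk_size=1000):
    """Run NER over a long document by processing fixed-size character chunks."""
    entities = []
    for start in range(0, len(text), chunk_size):
        chunk = text[start:start + chunk_size]
        for entity in nlp(chunk):
            entity["start"] += start   # map character offsets back to the full document
            entity["end"] += start
            entities.append(entity)
    return entities

long_text = " ".join(["John lives in New York City."] * 200)
print(len(ner_long_text(long_text, nlp)))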
Acknowledgements
- Transformers Library: Hugging Face
- Dataset: CoNLL-2003
- Base Model: bert-base-uncased by Google