## Dataset Used

This model was trained on the [CoNLL 2003 dataset](https://huggingface.co/datasets/eriktks/conll2003), a widely used benchmark for Named Entity Recognition (NER) tasks. The dataset includes the following labels:

- `O`, `B-PER`, `I-PER`, `B-ORG`, `I-ORG`, `B-LOC`, `I-LOC`, `B-MISC`, `I-MISC`

For detailed descriptions of these labels, please refer to the [dataset card](https://huggingface.co/datasets/eriktks/conll2003).

## Model Training Details

### Training Arguments

- **Model Architecture**: `bert-base-cased` for token classification
- **Learning Rate**: `2e-5`
- **Number of Epochs**: `20`
- **Weight Decay**: `0.01`
- **Evaluation Strategy**: `epoch`
- **Save Strategy**: `epoch`

*Additional default parameters from the Hugging Face Transformers library were used.*

## Evaluation Results

### Validation Set Performance

- **Overall Metrics**:
  - Precision: 94.44%
  - Recall: 95.74%
  - F1 Score: 95.09%
  - Accuracy: 98.73%

#### Per-Label Performance

| Entity Type | Precision | Recall | F1 Score |
|-------------|-----------|--------|----------|
| LOC         | 97.27%    | 97.11% | 97.19%   |
| MISC        | 87.46%    | 91.54% | 89.45%   |
| ORG         | 93.37%    | 93.44% | 93.40%   |
| PER         | 96.02%    | 98.15% | 97.07%   |

### Test Set Performance

- **Overall Metrics**:
  - Precision: 89.90%
  - Recall: 91.91%
  - F1 Score: 90.89%
  - Accuracy: 97.27%

#### Per-Label Performance

| Entity Type | Precision | Recall | F1 Score |
|-------------|-----------|--------|----------|
| LOC         | 92.87%    | 92.87% | 92.87%   |
| MISC        | 75.55%    | 82.76% | 78.99%   |
| ORG         | 88.32%    | 90.61% | 89.45%   |
| PER         | 95.28%    | 96.23% | 95.75%   |

## How to Use the Model

You can load the model directly from the Hugging Face Model Hub:

```python
from transformers import pipeline

# Replace with your specific model checkpoint
model_checkpoint = "Prikshit7766/bert-finetuned-ner-accelerate"

token_classifier = pipeline(
    "token-classification",
    model=model_checkpoint,
    aggregation_strategy="simple"
)

# Example usage
result = token_classifier("My name is Sylvain and I work at Hugging Face in Brooklyn.")
print(result)
```

### Example Output

```python
[
    {
        'entity_group': 'PER',
        'score': 0.9999658,
        'word': 'Sylvain',
        'start': 11,
        'end': 18
    },
    {
        'entity_group': 'ORG',
        'score': 0.99996203,
        'word': 'Hugging Face',
        'start': 33,
        'end': 45
    },
    {
        'entity_group': 'LOC',
        'score': 0.9999542,
        'word': 'Brooklyn',
        'start': 49,
        'end': 57
    }
]
```
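
The `start` and `end` fields in each result are character offsets into the input string, so entity text can be recovered by slicing the original sentence. The following is a minimal sketch of one way to post-process the output, not part of the model itself: the helper `group_entities` and its `min_score` threshold are hypothetical names introduced for illustration, and the entity list is copied from the example output above so the snippet runs without downloading the model.

```python
from collections import defaultdict

text = "My name is Sylvain and I work at Hugging Face in Brooklyn."

# Entities copied verbatim from the example output above.
entities = [
    {"entity_group": "PER", "score": 0.9999658, "word": "Sylvain", "start": 11, "end": 18},
    {"entity_group": "ORG", "score": 0.99996203, "word": "Hugging Face", "start": 33, "end": 45},
    {"entity_group": "LOC", "score": 0.9999542, "word": "Brooklyn", "start": 49, "end": 57},
]


def group_entities(text, entities, min_score=0.0):
    """Hypothetical helper: map each entity type to the spans found in `text`.

    Spans are sliced from `text` with the character offsets rather than read
    from the `word` field, and entities scoring below `min_score` are skipped.
    """
    grouped = defaultdict(list)
    for ent in entities:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(text[ent["start"]:ent["end"]])
    return dict(grouped)


by_type = group_entities(text, entities, min_score=0.9)
print(by_type)
# {'PER': ['Sylvain'], 'ORG': ['Hugging Face'], 'LOC': ['Brooklyn']}
```

Slicing by offset keeps the original casing and spacing of the input, which can matter if the `word` field has been altered by tokenization. The `0.9` threshold is an arbitrary illustrative value, not a recommendation derived from the evaluation results.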