## Dataset Used

This model was trained on the [CoNLL 2003 dataset](https://huggingface.co/datasets/eriktks/conll2003), a widely used benchmark for Named Entity Recognition (NER) tasks.

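The dataset uses nine token-level labels. `B-` marks the first token of an entity, `I-` marks a continuation token, and `O` marks tokens outside any entity. The entity types are person (`PER`), organization (`ORG`), location (`LOC`), and miscellaneous (`MISC`):
- `O`, `B-PER`, `I-PER`, `B-ORG`, `I-ORG`, `B-LOC`, `I-LOC`, `B-MISC`, `I-MISC`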

For detailed descriptions of these labels, please refer to the [dataset card](https://huggingface.co/datasets/eriktks/conll2003).
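
As a quick illustration, the nine tags can be mapped to integer ids, which is the form a token-classification head works with. This is a minimal sketch that assumes the ids follow the order in which the labels are listed above; check that ordering against the dataset card. The names `label_names`, `id2label`, `label2id`, and `entity_types` are chosen here for illustration.

```python
# Labels in the order listed above (assumed to match the dataset's id order).
label_names = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]

# Forward and reverse lookups between integer ids and tag strings.
id2label = dict(enumerate(label_names))
label2id = {label: idx for idx, label in id2label.items()}

# Entity types are the tags with the B-/I- prefix stripped, in first-seen order.
entity_types = list(dict.fromkeys(name.split("-", 1)[1] for name in label_names if name != "O"))

print(id2label[3])    # B-ORG
print(entity_types)   # ['PER', 'ORG', 'LOC', 'MISC']
```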

## Model Training Details

### Training Arguments
- **Base Model**: `bert-base-cased`, fine-tuned with a token-classification head
- **Learning Rate**: `2e-5`
- **Number of Epochs**: `20`
- **Weight Decay**: `0.01`
- **Evaluation Strategy**: `epoch`
- **Save Strategy**: `epoch`

*All other parameters were left at their Hugging Face Transformers defaults.*
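
For reference, the settings listed above can be gathered into one configuration object. The sketch below is only an illustration: the key names mirror the Transformers `TrainingArguments` parameters, but this card does not say which training API was actually used, so treat that mapping as an assumption. `training_config` and `base_checkpoint` are hypothetical names.

```python
# Hyperparameters reported in this card, collected in one dict.
# Key names follow Transformers' `TrainingArguments`; using that class is an assumption.
training_config = {
    "learning_rate": 2e-5,
    "num_train_epochs": 20,
    "weight_decay": 0.01,
    "eval_strategy": "epoch",   # named `evaluation_strategy` in older Transformers releases
    "save_strategy": "epoch",
}

# Pretrained checkpoint that was fine-tuned.
base_checkpoint = "bert-base-cased"

for name, value in training_config.items():
    print(f"{name} = {value}")
```

If the Trainer API is used, a dict like this could be unpacked into `TrainingArguments` together with an `output_dir` of your choice.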

## Evaluation Results

### Validation Set Performance
- **Overall Metrics**:
  - Precision: 94.44%
  - Recall: 95.74%
  - F1 Score: 95.09%
  - Accuracy: 98.73%

#### Per-Label Performance
| Entity Type | Precision | Recall | F1 Score |
|------------|-----------|--------|----------|
| LOC        | 97.27%    | 97.11% | 97.19%   |
| MISC       | 87.46%    | 91.54% | 89.45%   |
| ORG        | 93.37%    | 93.44% | 93.40%   |
| PER        | 96.02%    | 98.15% | 97.07%   |

### Test Set Performance
- **Overall Metrics**:
  - Precision: 89.90%
  - Recall: 91.91%
  - F1 Score: 90.89%
  - Accuracy: 97.27%

#### Per-Label Performance
| Entity Type | Precision | Recall | F1 Score |
|------------|-----------|--------|----------|
| LOC        | 92.87%    | 92.87% | 92.87%   |
| MISC       | 75.55%    | 82.76% | 78.99%   |
| ORG        | 88.32%    | 90.61% | 89.45%   |
| PER        | 95.28%    | 96.23% | 95.75%   |
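
The F1 scores in this section can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The following is a small sketch using the overall figures from both splits; `f1_score` and `overall` are illustrative names, and the card does not state which evaluation library produced the original numbers.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision %, recall %) pairs copied from the overall metrics above.
overall = {
    "validation": (94.44, 95.74),
    "test": (89.90, 91.91),
}

for split, (p, r) in overall.items():
    print(f"{split}: F1 = {f1_score(p, r):.2f}%")
```

Rounded to two decimals, this reproduces the reported 95.09% (validation) and 90.89% (test).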

## How to Use the Model

You can load the model directly from the Hugging Face Model Hub:

```python
from transformers import pipeline

# Checkpoint on the Hugging Face Hub (swap in your own fine-tuned checkpoint if needed)
model_checkpoint = "Prikshit7766/bert-finetuned-ner-accelerate"
token_classifier = pipeline(
    "token-classification",
    model=model_checkpoint,
    aggregation_strategy="simple",  # merge sub-word tokens into whole entities
)

# Example usage
result = token_classifier("My name is Sylvain and I work at Hugging Face in Brooklyn.")
print(result)
```

### Example Output
```python
[
    {
        'entity_group': 'PER', 
        'score': 0.9999658, 
        'word': 'Sylvain', 
        'start': 11, 
        'end': 18
    },
    {
        'entity_group': 'ORG', 
        'score': 0.99996203, 
        'word': 'Hugging Face', 
        'start': 33, 
        'end': 45
    },
    {
        'entity_group': 'LOC', 
        'score': 0.9999542, 
        'word': 'Brooklyn', 
        'start': 49, 
        'end': 57
    }
]
```
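
The `start` and `end` fields are character offsets into the input string, so each entity can be recovered by slicing the original text. Here is a minimal post-processing sketch that groups the entities from the example output by type; the variable names are illustrative, and the scores are omitted for brevity.

```python
text = "My name is Sylvain and I work at Hugging Face in Brooklyn."

# Entities copied from the example output above (scores omitted for brevity).
entities = [
    {"entity_group": "PER", "word": "Sylvain", "start": 11, "end": 18},
    {"entity_group": "ORG", "word": "Hugging Face", "start": 33, "end": 45},
    {"entity_group": "LOC", "word": "Brooklyn", "start": 49, "end": 57},
]

# Group the surface strings by entity type, slicing them from the original text.
by_type = {}
for ent in entities:
    span = text[ent["start"]:ent["end"]]
    by_type.setdefault(ent["entity_group"], []).append(span)

print(by_type)
# {'PER': ['Sylvain'], 'ORG': ['Hugging Face'], 'LOC': ['Brooklyn']}
```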