gravitee-io/bert-small-pii-detection
This model is based on prajjwal1/bert-small, a distilled and efficient version of BERT. It features 4 encoder layers, a hidden size of 512, and 8 attention heads, making it significantly lighter and faster than bert-base-uncased while retaining strong performance on downstream tasks.
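To put "significantly lighter" in numbers, the following sketch estimates the parameter count of both encoders from their layer count and hidden size. It assumes the standard BERT uncased vocabulary (30,522 tokens), 512 positions, 2 token types, and a feed-forward width of 4 times the hidden size; none of these values are stated in this model card, and the token classification head is not counted.

```python
def bert_param_count(layers, hidden, vocab=30522, max_pos=512, type_vocab=2):
    """Approximate parameter count of a BERT encoder with pooler (no task head).

    vocab, max_pos, type_vocab and the 4x feed-forward width are assumptions
    taken from the standard BERT uncased configuration.
    """
    inter = 4 * hidden        # assumed feed-forward (intermediate) width
    layer_norm = 2 * hidden   # scale and bias
    embeddings = (vocab + max_pos + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    feed_forward = (hidden * inter + inter) + (inter * hidden + hidden) + layer_norm
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


small = bert_param_count(layers=4, hidden=512)   # bert-small
base = bert_param_count(layers=12, hidden=768)   # bert-base-uncased

print(f"bert-small: {small / 1e6:.1f}M parameters")  # 28.8M
print(f"bert-base:  {base / 1e6:.1f}M parameters")   # 109.5M
print(f"ratio:      {base / small:.1f}x")            # 3.8x
```

The number of attention heads does not appear in the formula because splitting the hidden size across heads does not change the size of the projection matrices.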
The original bert-small model was pre-trained on English corpora with the standard masked language modeling objective. It is particularly suitable for real-time inference and edge deployments where computational efficiency is critical.
This model has been fine-tuned for Named Entity Recognition (NER) on synthetic multilingual financial PII data from the Gretel.ai dataset. The focus is on detecting sensitive personal information in financial contexts, and the model is optimized for English (language: en).
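As an illustration of how detected PII spans might be consumed downstream, here is a minimal redaction sketch. It assumes the detections are available as dictionaries with `entity_group`, `start`, and `end` character offsets, the shape produced by the Hugging Face token-classification pipeline when aggregation is enabled. The `redact` helper and the hand-written detections below are hypothetical; they are not output from this model.

```python
def redact(text, entities):
    """Replace each detected span with a [LABEL] placeholder.

    Assumes spans do not overlap and that offsets are character positions
    with an exclusive end.
    """
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group'].upper()}]"
        out = out[:ent["start"]] + placeholder + out[ent["end"]:]
    return out


text = "Contact Jane Doe at 555-0100."
# Hand-written example detections (hypothetical, not real model output).
detections = [
    {"entity_group": "name", "start": 8, "end": 16},
    {"entity_group": "phone_number", "start": 20, "end": 28},
]

print(redact(text, detections))  # Contact [NAME] at [PHONE_NUMBER].
```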
Evaluation Methodology
Entity Mappings and Label Adjustments
As part of the fine-tuning process, the entity set was intentionally modified and optimized to better reflect our business-specific requirements and real-world usage scenarios.
Key adjustments include:
- Merging or splitting certain entity types to improve classification performance,
- Renaming labels for consistency and clarity in downstream applications,
- Adding or removing entities based on their relevance to financial Personally Identifiable Information (PII) detection.
The full mapping of original entity labels to the adjusted set is provided in the entity_mappings file.
All reported evaluation metrics are calculated after applying this mapping, ensuring they accurately reflect the model's performance in our target setup.
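The sketch below shows one way such a mapping could be applied to annotated spans before scoring. The source labels (`first_name`, `last_name`, `phone`, `customer_id`) and the `remap_spans` helper are hypothetical and chosen only to illustrate merging, renaming, and removing; the actual mapping is the one in the entity_mappings file.

```python
# Hypothetical mapping for illustration only; see entity_mappings for the real one.
EXAMPLE_MAPPING = {
    "first_name": "name",     # merge two source labels into one
    "last_name": "name",
    "phone": "phone_number",  # rename
    "customer_id": None,      # remove from the label set
}


def remap_spans(spans, mapping):
    """Apply a label mapping to (label, start, end) spans.

    Labels missing from the mapping pass through unchanged; labels mapped
    to None are dropped.
    """
    remapped = []
    for label, start, end in spans:
        new_label = mapping.get(label, label)
        if new_label is not None:
            remapped.append((new_label, start, end))
    return remapped


original = [("first_name", 0, 1), ("last_name", 1, 2),
            ("customer_id", 4, 5), ("company", 7, 9)]
print(remap_spans(original, EXAMPLE_MAPPING))
# [('name', 0, 1), ('name', 1, 2), ('company', 7, 9)]
```

Applying the same function to both the ground truth and the predictions keeps the two label sets aligned, which is what makes the reported metrics comparable.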
Matching Schemes
The model was evaluated using nervaluate, with two matching schemes to reflect both practical and strict performance scenarios:
Exact Match:
- An entity prediction is considered correct only if the predicted entity exactly matches both the entity boundaries (start and end tokens) and the entity label.
- This approach penalizes both boundary errors and misclassifications, providing a conservative estimate of model performance that is useful for applications where exact localization is critical (e.g., redaction).
Entity Type Match:
- An entity is counted as correct if there is any overlap between the predicted span and the ground truth span, and the predicted label matches the true label.
- This scheme is more permissive and rewards partial matches, suitable for exploratory analysis and scenarios where partial detection is still valuable (e.g., highlighting sensitive spans).
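The difference between the two schemes can be sketched in a few lines of plain Python. This is a simplified illustration of the definitions above, not a call into nervaluate, and it omits nervaluate's finer-grained bookkeeping. Spans are `(label, start, end)` tuples with an exclusive end, and the scheme names `"exact_match"` and `"entity_type"` are chosen here for readability.

```python
def overlaps(a, b):
    """True if two (label, start, end) spans share at least one token."""
    return a[1] < b[2] and b[1] < a[2]


def score(gold, pred, scheme):
    """Precision, recall and F1 under a simplified matching scheme."""
    if scheme == "exact_match":
        # Same label and identical boundaries.
        def is_match(g, p):
            return g == p
    else:  # "entity_type"
        # Same label and any overlap between the spans.
        def is_match(g, p):
            return g[0] == p[0] and overlaps(g, p)

    unmatched = list(gold)  # each gold span may be matched at most once
    correct = 0
    for p in pred:
        hit = next((g for g in unmatched if is_match(g, p)), None)
        if hit is not None:
            unmatched.remove(hit)
            correct += 1

    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [("name", 0, 2), ("phone_number", 5, 8), ("company", 10, 12)]
pred = [("name", 0, 2),           # exact hit
        ("phone_number", 5, 7),   # right label, boundary one token short
        ("date_time", 10, 12)]    # right boundaries, wrong label

print(score(gold, pred, "exact_match"))  # 1 of 3 correct
print(score(gold, pred, "entity_type"))  # 2 of 3 correct
```

The truncated phone number counts only under the entity type scheme, while the mislabeled span counts under neither, which is why the entity type scores are always at least as high as the exact match scores.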
For more on the evaluation methodology, see the article about nervaluate.
Metrics
Entity Type
model | language | precision | recall | f1-score |
---|---|---|---|---|
prajjwal1-bert-small_1_onnx | English | 0.90 | 0.88 | 0.89 |
prajjwal1-bert-small_1_onnx_quant | English | 0.90 | 0.88 | 0.89 |
Exact Match
model | language | precision | recall | f1-score |
---|---|---|---|---|
prajjwal1-bert-small_1_onnx | English | 0.82 | 0.80 | 0.81 |
prajjwal1-bert-small_1_onnx_quant | English | 0.82 | 0.80 | 0.81 |
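The f1-score column in the two overall tables above is consistent with the usual definition of F1 as the harmonic mean of precision and recall. A quick check, assuming that definition (the `f1_score` helper is illustrative):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(round(f1_score(0.90, 0.88), 2))  # Entity Type row: 0.89
print(round(f1_score(0.82, 0.80), 2))  # Exact Match row: 0.81
```

Because the published precision and recall are already rounded to two decimals, recomputing F1 from them can differ from the reported value by 0.01 in some rows.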
Entity Type (entity)
model | language | entity | precision | recall | f1-score |
---|---|---|---|---|---|
prajjwal1-bert-small_1_onnx | English | company | 0.84 | 0.84 | 0.84 |
prajjwal1-bert-small_1_onnx | English | date_time | 0.90 | 0.91 | 0.91 |
prajjwal1-bert-small_1_onnx | English | | 0.98 | 0.96 | 0.97 |
prajjwal1-bert-small_1_onnx | English | misc | 0.90 | 0.87 | 0.88 |
prajjwal1-bert-small_1_onnx | English | name | 0.94 | 0.85 | 0.89 |
prajjwal1-bert-small_1_onnx | English | phone_number | 0.89 | 0.94 | 0.91 |
prajjwal1-bert-small_1_onnx | English | street_address | 0.90 | 0.86 | 0.88 |
prajjwal1-bert-small_1_onnx_quant | English | company | 0.84 | 0.85 | 0.84 |
prajjwal1-bert-small_1_onnx_quant | English | date_time | 0.90 | 0.91 | 0.91 |
prajjwal1-bert-small_1_onnx_quant | English | | 0.98 | 0.96 | 0.97 |
prajjwal1-bert-small_1_onnx_quant | English | misc | 0.89 | 0.86 | 0.88 |
prajjwal1-bert-small_1_onnx_quant | English | name | 0.94 | 0.85 | 0.89 |
prajjwal1-bert-small_1_onnx_quant | English | phone_number | 0.88 | 0.94 | 0.91 |
prajjwal1-bert-small_1_onnx_quant | English | street_address | 0.90 | 0.86 | 0.88 |
Exact Match (entity)
model | language | entity | precision | recall | f1-score |
---|---|---|---|---|---|
prajjwal1-bert-small_1_onnx | English | company | 0.79 | 0.79 | 0.79 |
prajjwal1-bert-small_1_onnx | English | date_time | 0.75 | 0.76 | 0.76 |
prajjwal1-bert-small_1_onnx | English | | 0.93 | 0.91 | 0.92 |
prajjwal1-bert-small_1_onnx | English | misc | 0.83 | 0.80 | 0.81 |
prajjwal1-bert-small_1_onnx | English | name | 0.89 | 0.81 | 0.85 |
prajjwal1-bert-small_1_onnx | English | phone_number | 0.89 | 0.94 | 0.91 |
prajjwal1-bert-small_1_onnx | English | street_address | 0.86 | 0.82 | 0.84 |
prajjwal1-bert-small_1_onnx_quant | English | company | 0.78 | 0.79 | 0.78 |
prajjwal1-bert-small_1_onnx_quant | English | date_time | 0.75 | 0.76 | 0.76 |
prajjwal1-bert-small_1_onnx_quant | English | | 0.93 | 0.91 | 0.92 |
prajjwal1-bert-small_1_onnx_quant | English | misc | 0.82 | 0.79 | 0.81 |
prajjwal1-bert-small_1_onnx_quant | English | name | 0.89 | 0.81 | 0.85 |
prajjwal1-bert-small_1_onnx_quant | English | phone_number | 0.88 | 0.93 | 0.91 |
prajjwal1-bert-small_1_onnx_quant | English | street_address | 0.86 | 0.82 | 0.84 |
Citation
@misc{bhargava2021generalization,
title={Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics},
author={Prajjwal Bhargava and Aleksandr Drozd and Anna Rogers},
year={2021},
eprint={2110.01518},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@article{DBLP:journals/corr/abs-1908-08962,
author = {Iulia Turc and
Ming{-}Wei Chang and
Kenton Lee and
Kristina Toutanova},
title = {Well-Read Students Learn Better: The Impact of Student Initialization
on Knowledge Distillation},
journal = {CoRR},
volume = {abs/1908.08962},
year = {2019},
url = {http://arxiv.org/abs/1908.08962},
eprinttype = {arXiv},
eprint = {1908.08962},
timestamp = {Thu, 29 Aug 2019 16:32:34 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-1908-08962.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}