BERT-Breaks (v0): Coming Soon 🚧
Status: Model training and evaluation planned; baseline placeholder repository.
Overview
BERT-Breaks-v0 serves as the vanilla BERT baseline for the Exception Handling & Reconciliation project.
It will be trained on the same corpus as our DistilBERT-Reconciler (3.2M labeled post-trade break descriptions and resolution actions), but using the original bert-base-uncased architecture.
The goal is to provide a performance benchmark against which lightweight and distilled models can be evaluated.
Intended Use
Automated classification of reconciliation exceptions in fixed-income settlement workflows (CUSIP/ISIN).
The model will output a label_id that maps to a human-readable root cause and a recommended resolution step.
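As a minimal sketch of how a predicted label_id could be turned into a root cause and resolution step, the snippet below uses a plain lookup table. The label IDs, root-cause names, resolution texts, and the `decode_label` helper are all hypothetical placeholders; the project's actual label taxonomy is not published here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resolution:
    root_cause: str
    recommended_step: str


# Hypothetical label table: IDs and texts are illustrative only,
# not the project's real taxonomy.
LABEL_MAP = {
    0: Resolution("Quantity mismatch", "Confirm the traded quantity with the counterparty"),
    1: Resolution("Missing settlement instruction", "Request standing settlement instructions"),
    2: Resolution("Price discrepancy", "Escalate to the trading desk for price confirmation"),
}

# Fallback for IDs that are not in the table (an assumption of this sketch).
UNKNOWN = Resolution("Unknown", "Route to manual review")


def decode_label(label_id: int) -> Resolution:
    """Return the root cause and resolution step for a predicted label_id."""
    return LABEL_MAP.get(label_id, UNKNOWN)


result = decode_label(1)
print(f"{result.root_cause} -> {result.recommended_step}")
```

Routing unmapped IDs to manual review, rather than raising an error, is one possible design choice for an exception-handling workflow; a stricter deployment might prefer to fail loudly.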
Planned Training Details
- Base: bert-base-uncased
- Epochs: TBD (expected 3–5)
- Learning Rate: TBD (expected ~3e-5)
- Max Length: 256
- Dataset: Proprietary + ISO 20022-derived corpus (post-trade break descriptions)
- Split: 80% train / 20% hold-out
- Evaluation Metrics: Accuracy, Micro-F1, Macro-F1
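The planned settings above could be captured in a small configuration object, together with an 80/20 split. This is only a sketch: the `TrainingConfig` class, the `split_examples` helper, the seed, and the choice of a simple random (non-stratified) shuffle are assumptions, and the epoch count and learning rate are still TBD in the plan.

```python
import random
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    base_model: str = "bert-base-uncased"
    max_length: int = 256
    epochs: int = 3               # TBD in the plan; 3 is the low end of the expected range
    learning_rate: float = 3e-5   # TBD in the plan; expected to be around 3e-5
    train_fraction: float = 0.8   # 80% train / 20% hold-out


def split_examples(examples, train_fraction=0.8, seed=42):
    """Shuffle a copy of the examples and split it into (train, holdout)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


config = TrainingConfig()
# Placeholder strings standing in for break descriptions.
examples = [f"break description {i}" for i in range(10)]
train, holdout = split_examples(examples, config.train_fraction)
print(len(train), len(holdout))  # 8 2
```

If the label distribution turns out to be heavily skewed, a stratified split may be preferable to the plain shuffle used here.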
Expected Benchmark
Model | Accuracy | Micro-F1 | Macro-F1 |
---|---|---|---|
DistilBERT-Reconciler | 0.88 | 0.88 | 0.85 |
BERT-Breaks-v0 | (Coming) | (Coming) | (Coming) |
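For reference, the three metrics in the table can be computed as in the sketch below, assuming single-label multi-class classification (one label_id per break description). Under that assumption micro-F1 equals accuracy, which is consistent with the identical 0.88 values in the DistilBERT-Reconciler row. The `classification_metrics` function and the toy labels are illustrative and not part of the project's evaluation code.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Compute accuracy, micro-F1 and macro-F1 for single-label predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gets a false positive
            fn[t] += 1  # true class gets a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    accuracy = sum(tp.values()) / len(y_true)
    # Micro-F1 pools the counts over all classes before computing F1.
    micro_f1 = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro-F1 averages the per-class F1 scores with equal weight.
    macro_f1 = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return {"accuracy": accuracy, "micro_f1": micro_f1, "macro_f1": macro_f1}


# Toy labels for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On the toy data, accuracy and micro-F1 are both 0.75, while macro-F1 is lower (about 0.739) because the rarest class is predicted least well. Macro-F1 is therefore the more informative column when break categories are imbalanced.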
Limitations & Bias
- Labels are derived from North-American corporate-bond desks (2019–2025).
- May under-perform on equities, repos, or non-USD instruments without re-training.
- The baseline model is expected to have higher inference latency than the distilled variants.
Citation
Musodza, K. (2025). Bond Settlement Automated Exception Handling and Reconciliation. Zenodo. https://doi.org/10.5281/zenodo.16828730
Related Models
- DistilBERT-Reconciler: Fine-tuned lightweight alternative.
- Streaming-fail-forecaster: Next-day settlement-fail forecasting models.
- settlement-stress-flagger-v1: CUSIP-level stress-event classifier.