Upload cb-bert-mlm.ipynb
This notebook contains the complete training pipeline for CB-BERT-MLM, a domain-adapted masked language model based on bert-base-uncased. It includes:
- Preprocessing of BIS central bank speeches (1996–2024)
- Sentence-level tokenization and masking
- MLM training configuration and execution
- Model evaluation (perplexity, top‑k accuracy, manual masked sentence test)
- Token and parameter statistics for reproducibility
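
As a rough illustration of the masking step listed above, here is a minimal sketch of BERT-style token masking in plain Python. The 15% selection rate and the 80/10/10 split (mask / random token / unchanged) are the defaults from the original BERT recipe and are assumed here; this commit message does not state the values the notebook uses. The `mask_tokens` helper and its parameters are hypothetical names, not taken from the notebook.

```python
import random

MASK_TOKEN = "[MASK]"  # mask token used by bert-base-uncased


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) for one tokenized sentence.

    labels[i] holds the original token at positions selected for
    prediction and None everywhere else.
    Assumption: 15% selection with an 80/10/10 split (BERT defaults).
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must predict this token
            roll = rng.random()
            if roll < 0.8:
                masked.append(MASK_TOKEN)         # 80%: replace with [MASK]
            elif roll < 0.9:
                masked.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                masked.append(tok)                # 10%: keep the original token
        else:
            masked.append(tok)
            labels.append(None)  # not part of the MLM loss
    return masked, labels
```

In practice a library data collator would usually handle this on token IDs; the sketch only shows the selection logic on string tokens.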
Refer to this notebook for full experimental details and code to replicate model training.
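
For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how perplexity and top-k accuracy are commonly computed over masked positions. It assumes per-token natural-log probabilities and ranked candidate lists are already available from the model; the function names and input shapes are hypothetical and may differ from what the notebook does.

```python
import math


def perplexity(token_log_probs):
    """Exponential of the mean negative log-likelihood.

    token_log_probs: natural-log probabilities the model assigned to the
    correct token at each masked position (assumed input format).
    """
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def top_k_accuracy(ranked_predictions, targets, k=5):
    """Fraction of masked positions whose true token is among the top k.

    ranked_predictions: one list of candidate tokens per masked position,
    ordered from most to least likely (assumed input format).
    """
    hits = sum(
        1 for preds, target in zip(ranked_predictions, targets)
        if target in preds[:k]
    )
    return hits / len(targets)
```

A lower perplexity and a higher top-k accuracy both indicate that the model assigns more probability mass to the correct masked tokens.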
- cb-bert-mlm.ipynb +0 -0
cb-bert-mlm.ipynb
ADDED