bilalzafar committed on
Commit 35d01f9 · verified · 1 Parent(s): 040e0ad

Upload cb-bert-mlm.ipynb

This notebook contains the complete training pipeline for the CB-BERT-MLM model, a domain-adapted masked language model based on bert-base-uncased. It includes:

- Preprocessing of BIS central bank speeches (1996–2024)
- Sentence-level tokenization and masking
- MLM training configuration and execution
- Model evaluation (perplexity, top‑k accuracy, manual masked sentence test)
- Token and parameter statistics for reproducibility
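
For readers unfamiliar with the evaluation metrics listed above, here is a minimal sketch of how perplexity and top-k accuracy are commonly computed for a masked language model. It is not taken from the notebook. The function names `perplexity` and `top_k_accuracy`, the toy inputs, and the choice to average over masked positions only are assumptions made for illustration; the notebook may compute these metrics differently.

```python
import math


def perplexity(masked_token_nlls):
    """Return exp(mean negative log-likelihood) over masked positions.

    Assumes each value is the natural-log NLL of the true token at one
    masked position (hypothetical input format, not the notebook's).
    """
    if not masked_token_nlls:
        raise ValueError("need at least one masked position")
    return math.exp(sum(masked_token_nlls) / len(masked_token_nlls))


def top_k_accuracy(score_rows, true_ids, k=5):
    """Return the fraction of masked positions whose true token id is
    among the k highest-scoring vocabulary ids.

    score_rows: one list of per-vocabulary scores per masked position.
    true_ids:   the true token id for each masked position.
    """
    if len(score_rows) != len(true_ids) or not true_ids:
        raise ValueError("score_rows and true_ids must be non-empty and aligned")
    hits = 0
    for scores, true_id in zip(score_rows, true_ids):
        # Rank vocabulary ids from highest to lowest score, keep the first k.
        top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        hits += true_id in top_k
    return hits / len(true_ids)


# Toy data: two masked positions over a three-token vocabulary.
toy_nlls = [math.log(2), math.log(8)]
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
toy_true_ids = [1, 2]

toy_ppl = perplexity(toy_nlls)
toy_top1 = top_k_accuracy(toy_scores, toy_true_ids, k=1)
toy_top3 = top_k_accuracy(toy_scores, toy_true_ids, k=3)
```

In a real run, the NLL values and score rows would come from the model's output logits at the masked positions, with unmasked positions excluded before averaging.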

Refer to this notebook for full experimental details and the code needed to replicate model training.

Files changed (1)
  1. cb-bert-mlm.ipynb +0 -0
cb-bert-mlm.ipynb ADDED
The diff for this file is too large to render. See raw diff