metadata
language: sv
A Swedish BERT model
Model description
This model has the same architecture as the BERT Large model. It is implemented with the Megatron BERT architecture and has the following hyperparameters:
Hyperparameter | Value |
---|---|
Parameters | 340M |
Layers | 24 |
Attention heads | 16 |
Hidden size | 1024 |
Vocabulary size | 30592 |
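As a rough sanity check on the figures above, the parameter count can be estimated from the other values. This is a minimal sketch that reads the table as 24 layers, a hidden size of 1024 and a vocabulary of 30592 tokens. It assumes the standard BERT encoder layout with 512 positions, 2 segment types and a feed-forward width of 4 times the hidden size; none of those three values is stated in this card, and the Megatron variant places its layer norms slightly differently, so the result is only approximate. The function name `bert_param_count` is hypothetical.

```python
def bert_param_count(layers, hidden, vocab,
                     max_positions=512, type_vocab=2, ffn_mult=4):
    """Estimate the parameter count of a standard BERT encoder.

    max_positions, type_vocab and ffn_mult are assumed defaults,
    not values taken from the model card.
    """
    ffn = ffn_mult * hidden
    # Token, position and segment embeddings plus one layer norm (scale + bias).
    embeddings = (vocab + max_positions + type_vocab) * hidden + 2 * hidden
    # Query, key, value and output projections, each with a bias vector.
    attention = 4 * (hidden * hidden + hidden)
    # Two dense layers of the feed-forward block, each with a bias vector.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Two layer norms per encoder layer (scale + bias each).
    layer_norms = 2 * (2 * hidden)
    per_layer = attention + feed_forward + layer_norms
    # Pooler dense layer applied to the first token.
    pooler = hidden * hidden + hidden
    return embeddings + layers * per_layer + pooler


total = bert_param_count(layers=24, hidden=1024, vocab=30592)
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the estimate comes out at roughly 335M, which is in line with the 340M listed in the table. The number of attention heads does not enter the count, because the heads split the same projection matrices.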
Training data
This repository contains a BERT Large model pretrained on a Swedish text corpus of around 80 GB from a variety of sources as shown below.
Dataset | Genre | Size (GB) |
---|---|---|
Anföranden | Politics | 0.9 |
DCEP | Politics | 0.6 |
DGT | Politics | 0.7 |
Fass | Medical | 0.6 |
Författningar | Legal | 0.1 |
Web data | Misc | 45.0 |
JRC | Legal | 0.4 |
Litteraturbanken | Books | 0.3 |
SCAR | Misc | 28.0 |
SOU | Politics | 5.3 |
Subtitles | Drama | 1.3 |
Wikipedia | Facts | 1.8 |
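To see how the corpus is distributed across sources, the sizes in the table can be summed directly. The sketch below copies the values as listed, reading the Litteraturbanken entry as 0.3 GB; the variable names are illustrative only.

```python
# Dataset sizes in GB, copied from the table above.
sizes_gb = {
    "Anföranden": 0.9,
    "DCEP": 0.6,
    "DGT": 0.7,
    "Fass": 0.6,
    "Författningar": 0.1,
    "Web data": 45.0,
    "JRC": 0.4,
    "Litteraturbanken": 0.3,
    "SCAR": 28.0,
    "SOU": 5.3,
    "Subtitles": 1.3,
    "Wikipedia": 1.8,
}

# Round to one decimal to avoid floating-point noise in the sum.
total = round(sum(sizes_gb.values()), 1)

# Share of the corpus contributed by each source, in percent.
shares = {name: round(100 * size / total, 1) for name, size in sizes_gb.items()}
largest = max(sizes_gb, key=sizes_gb.get)

print(f"Total: {total} GB")
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:18s} {share:5.1f} %")
```

The listed sizes add up to 85.0 GB, slightly above the "around 80 GB" quoted in the text, and the two Misc sources together account for most of the corpus.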
Intended uses & limitations
The raw model can be used for the usual tasks of masked language modeling or next sentence prediction. It is also often fine-tuned on a downstream task to improve its performance in a specific domain/task.
How to use
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the tokenizer and the masked-language-model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("AI-Nordics/bert-large-swedish-cased")
model = AutoModelForMaskedLM.from_pretrained("AI-Nordics/bert-large-swedish-cased")
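For masked language modeling, one possible way to query the model is through the `fill-mask` pipeline in `transformers`. This is a sketch rather than the authors' recommended usage: the helper names `mask_sentence` and `predict_masked`, the `{mask}` placeholder convention and the example sentence are all assumptions made for illustration.

```python
MODEL_ID = "AI-Nordics/bert-large-swedish-cased"


def mask_sentence(template: str, mask_token: str) -> str:
    """Replace the '{mask}' placeholder with the tokenizer's mask token."""
    return template.format(mask=mask_token)


def predict_masked(template: str, top_k: int = 5):
    """Return (token, score) pairs for the masked position in `template`."""
    # Imported lazily so that mask_sentence works without transformers installed.
    from transformers import pipeline

    # Downloads the model weights from the Hugging Face Hub on first use.
    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    text = mask_sentence(template, fill_mask.tokenizer.mask_token)
    return [(r["token_str"], r["score"]) for r in fill_mask(text, top_k=top_k)]


# Example call (hypothetical sentence, requires network access):
# for token, score in predict_masked("Stockholm är Sveriges {mask}."):
#     print(f"{token}\t{score:.3f}")
```

Reading the mask token from the tokenizer instead of hard-coding it keeps the helper usable with other checkpoints whose mask token differs.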