---
language: sv
---

# A Swedish BERT model

## Model description

This model has the same architecture as BERT Large. It is implemented with the Megatron-BERT architecture and uses the following hyperparameters:

| Hyperparameter     | Value  |
|--------------------|--------|
| $n_{parameters}$   | 340M   |
| $n_{layers}$       | 24     |
| $n_{heads}$        | 16     |
| $n_{ctx}$          | 1024   |
| $n_{vocab}$        | 30592  |
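
As a quick sanity check, the published configuration can be inspected with the Transformers `AutoConfig` API. The sketch below is illustrative; the exact field values depend on the `config.json` uploaded with the checkpoint.

```python
from transformers import AutoConfig

# Load the configuration shipped with the checkpoint and compare it
# against the hyperparameters listed in the table above.
config = AutoConfig.from_pretrained("AI-Nordics/bert-large-swedish-cased")

print(config.num_hidden_layers)        # expected 24
print(config.num_attention_heads)      # expected 16
print(config.vocab_size)               # expected 30592
print(config.max_position_embeddings)  # expected 1024
```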

## Training data

This repository contains a BERT Large model pretrained on a Swedish text corpus of around 80 GB from a variety of sources as shown below.

| Dataset          | Genre    | Size (GB) |
|------------------|----------|-----------|
| Anföranden       | Politics | 0.9       |
| DCEP             | Politics | 0.6       |
| DGT              | Politics | 0.7       |
| Fass             | Medical  | 0.6       |
| Författningar    | Legal    | 0.1       |
| Web data         | Misc     | 45.0      |
| JRC              | Legal    | 0.4       |
| Litteraturbanken | Books    | 0.3       |
| SCAR             | Misc     | 28.0      |
| SOU              | Politics | 5.3       |
| Subtitles        | Drama    | 1.3       |
| Wikipedia        | Facts    | 1.8       |

## Intended uses & limitations

The raw model can be used for masked language modeling or next sentence prediction. It is, however, most often fine-tuned on a downstream task to adapt it to a specific domain.
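
For example, the checkpoint can be loaded with a task-specific head before fine-tuning. The sketch below assumes a binary sequence-classification task; `num_labels=2` is only illustrative.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the pretrained encoder with a freshly initialized classification head.
# The head must then be trained on labeled downstream data.
tokenizer = AutoTokenizer.from_pretrained("AI-Nordics/bert-large-swedish-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "AI-Nordics/bert-large-swedish-cased", num_labels=2
)
```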

## How to use

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("AI-Nordics/bert-large-swedish-cased")
model = AutoModelForMaskedLM.from_pretrained("AI-Nordics/bert-large-swedish-cased")
```
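
A minimal masked-language-modeling sketch using the `fill-mask` pipeline is shown below; the example sentence is only illustrative.

```python
from transformers import pipeline

# Predict the masked token in a Swedish sentence ("The capital of Sweden is [MASK].").
fill_mask = pipeline("fill-mask", model="AI-Nordics/bert-large-swedish-cased")
print(fill_mask("Huvudstaden i Sverige är [MASK]."))
```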