---
language: sv
---

# A Swedish BERT model

## Model description

This model has the same architecture as BERT Large. It is implemented with the Megatron-BERT architecture and uses the following hyperparameters:

| Hyperparameter     | Value  |
|--------------------|--------|
| $n_{parameters}$   | 340M   |
| $n_{layers}$       | 24     |
| $n_{heads}$        | 16     |
| $n_{ctx}$          | 1024   |
| $n_{vocab}$        | 30592  |
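
As a quick sanity check, the published configuration can be inspected with the Transformers `AutoConfig` API. The sketch below is illustrative; the exact field values depend on the `config.json` uploaded with the checkpoint.

```python
from transformers import AutoConfig

# Load the configuration shipped with the checkpoint and compare it
# against the hyperparameters listed in the table above.
config = AutoConfig.from_pretrained("AI-Nordics/bert-large-swedish-cased")

print(config.num_hidden_layers)        # expected 24
print(config.num_attention_heads)      # expected 16
print(config.vocab_size)               # expected 30592
print(config.max_position_embeddings)  # expected 1024
```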

## Training data

This repository contains a BERT Large model pretrained on a Swedish text corpus of around 80 GB from a variety of sources as shown below.

| Dataset          | Genre    | Size (GB) |
|------------------|----------|-----------|
| Anföranden       | Politics | 0.9       |
| DCEP             | Politics | 0.6       |
| DGT              | Politics | 0.7       |
| Fass             | Medical  | 0.6       |
| Författningar    | Legal    | 0.1       |
| Web data         | Misc     | 45.0      |
| JRC              | Legal    | 0.4       |
| Litteraturbanken | Books    | 0.3       |
| SCAR             | Misc     | 28.0      |
| SOU              | Politics | 5.3       |
| Subtitles        | Drama    | 1.3       |
| Wikipedia        | Facts    | 1.8       |

## Intended uses & limitations

The raw model can be used for masked language modeling or next sentence prediction. It is, however, most often fine-tuned on a downstream task to adapt it to a specific domain.
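
For example, the checkpoint can be loaded with a task-specific head before fine-tuning. The sketch below assumes a binary sequence-classification task; `num_labels=2` is only illustrative.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the pretrained encoder with a freshly initialized classification head.
# The head must then be trained on labeled downstream data.
tokenizer = AutoTokenizer.from_pretrained("AI-Nordics/bert-large-swedish-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "AI-Nordics/bert-large-swedish-cased", num_labels=2
)
```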

## How to use

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("AI-Nordics/bert-large-swedish-cased")
model = AutoModelForMaskedLM.from_pretrained("AI-Nordics/bert-large-swedish-cased")
```
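
A minimal masked-language-modeling sketch using the `fill-mask` pipeline is shown below; the example sentence is only illustrative.

```python
from transformers import pipeline

# Predict the masked token in a Swedish sentence ("The capital of Sweden is [MASK].").
fill_mask = pipeline("fill-mask", model="AI-Nordics/bert-large-swedish-cased")
print(fill_mask("Huvudstaden i Sverige är [MASK]."))
```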