
State-of-the-art Danish Models
These models are the state of the art for Danish within their respective domains, as described in the note below each model.
Text Generation • 9B • 133k downloads • 722 likes
Note: Best-performing open-weight, instruction-tuned generative model in the ~7-9B parameter range on ScandEval. Rank on ScandEval Danish NLG (2024/12/14): 1.69
google/gemma-2-9b
Text Generation • 9B • 50.1k downloads • 663 likes
Note: Best-performing open-weight generative model in the ~7-9B parameter range on ScandEval that has not been instruction-tuned. Rank on ScandEval Danish NLG (2024/12/14): 2.02
AI-Sweden-Models/roberta-large-1160k
Fill-Mask • 0.4B • 80 downloads • 11 likes
Note: Large-sized encoder. Rank on ScandEval Danish NLU (2024/12/14): 1.38
vesteinn/DanskBERT
Fill-Mask • 0.1B • 27 downloads • 6 likes
Note: Medium-sized encoder. Rank on ScandEval Danish NLU (2024/12/14): 1.56
ltg/norbert3-small
Fill-Mask • 34.8k downloads • 2 likes
Note: Small-sized encoder. Rank on ScandEval Danish NLU (2024/12/14): 2.15
openai/whisper-large-v3
Automatic Speech Recognition • 2B • 3.45M downloads • 4.69k likes
Note: Automatic speech recognition. Word error rate on CoRal: 28.3
jinaai/jina-embeddings-v3
Feature Extraction • 0.6B • 4.22M downloads • 1.04k likes
Note: Large-sized embedding model with flexible embedding sizes and long-document understanding.
syvai/hviske-v2
2B • 868 downloads • 13 likes
Note: Automatic speech recognition model based on Whisper 3 and fine-tuned on CoRal. Word error rate on CoRal: 11.8
intfloat/multilingual-e5-large-instruct
Feature Extraction • 0.6B • 2.01M downloads • 537 likes
Note: Large-sized embedding model with instruction support.
CoRal-project/roest-wav2vec2-315m-v1
Automatic Speech Recognition • 0.3B • 13 downloads • 12 likes
Note: Speech encoder (Wav2Vec2.0). Word error rate on CoRal: 17.0
intfloat/multilingual-e5-large
Feature Extraction • 0.6B • 3.6M downloads • 989 likes
Note: Large-sized embedding model.
intfloat/multilingual-e5-base
Sentence Similarity • 0.3B • 1.62M downloads • 284 likes
Note: Medium-sized embedding model.
intfloat/multilingual-e5-small
Sentence Similarity • 0.1B • 1.66M downloads • 222 likes
Note: Small-sized embedding model.
facebook/seamless-m4t-v2-large
Automatic Speech Recognition • 2B • 43.2k downloads • 864 likes
Note: Machine translation (and other tasks)
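
Several notes above compare speech models by word error rate (WER) on CoRal. As a reminder of what that metric measures, here is a minimal sketch in plain Python: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a model's hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are illustrative assumptions; the CoRal evaluation may apply its own text normalization before scoring, which this sketch does not attempt to reproduce.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration only. Tokenization is a plain
    whitespace split, with no casing or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words yields an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of four reference words is missing from the hypothesis.
    print(word_error_rate("jeg bor i danmark", "jeg bor danmark"))
```

A fraction returned by this function would be multiplied by 100 to express it as a percentage. Whether the CoRal figures quoted in the notes are percentages is an assumption here, not something the list states.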