---
language:
  - da
  - 'no'
  - sv
tags:
  - embeddings
  - sentence-transformers
  - scandinavian
  - semantic-search
  - retrieval
license: apache-2.0
---

Qwen3-Embedding-Scandi-0.6B

A fine-tuned version of Qwen/Qwen3-Embedding-0.6B for Scandinavian text embeddings (Danish, Norwegian, Swedish).


Model Summary

  • Base model: Qwen/Qwen3-Embedding-0.6B
  • Architecture: Transformer-based embedding model (0.6B parameters)
  • Fine-tuning: LoRA adapters trained with Swift, then merged into the base weights
  • Task: Sentence and document embeddings for retrieval, clustering, and semantic similarity
  • Languages: 🇩🇰 Danish, 🇸🇪 Swedish, 🇳🇴 Norwegian

Intended Use

This model is intended for representation learning tasks such as:

  • Semantic search
  • Text clustering
  • Document retrieval
  • Reranking pipelines

Not recommended for text generation.
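
As a rough illustration of the semantic search use case, the sketch below encodes a query and a few documents with sentence-transformers and ranks the documents by cosine similarity. The repository id in `MODEL_ID` is an assumption based on the model name and should be replaced with the actual Hub path; the helper names (`embed`, `cosine`, `rank_documents`, `search`) are hypothetical and not part of any library.

```python
import math
from typing import List, Sequence, Tuple

# Assumed repository id; replace with the actual Hub path of this model.
MODEL_ID = "emillykkejensen/Qwen3-Embedding-Scandi-0.6B"


def embed(texts: Sequence[str], model_id: str = MODEL_ID) -> List[List[float]]:
    """Encode texts into L2-normalized vectors with sentence-transformers."""
    # Imported lazily so the ranking helpers below work without the library.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(list(texts), normalize_embeddings=True).tolist()


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def search(query: str, documents: Sequence[str]) -> List[Tuple[str, float]]:
    """Embed the query and documents together, then rank the documents."""
    vectors = embed([query, *documents])
    ranked = rank_documents(vectors[0], vectors[1:])
    return [(documents[i], score) for i, score in ranked]
```

A call such as `search("Hvad er hovedstaden i Danmark?", ["København er Danmarks hovedstad.", "Oslo ligger i Norge."])` would return the documents ordered by similarity to the query. The sketch reloads the model on every call to `embed` for brevity; a real pipeline would load it once and reuse it.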


Training Details


Checkpoints

  • LoRA weights merged into the base model.
  • SafeTensors format used for efficiency.
  • Tokenizer from base model copied for compatibility.
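
For readers unfamiliar with what merging LoRA weights means, the standard LoRA formulation folds each low-rank update into the corresponding base weight matrix. The rank $r$ and scaling factor $\alpha$ used for this model are not stated in this card, so the symbols below are generic:

$$
W_{\text{merged}} = W_0 + \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k}
$$

Here $W_0 \in \mathbb{R}^{d \times k}$ is a base weight matrix and $B$, $A$ are the trained adapter matrices. After merging, the checkpoint has the same shape as the base model and loads without a separate adapter.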

Limitations & Bias

  • Fine-tuned for Danish, Norwegian, and Swedish only; quality on other languages may be poor.
  • Embeddings are sensitive to domain shift (best results on text similar to training data).
  • As with all language models, embeddings may encode societal biases present in the training data.

Acknowledgements