maux-gte-persian-v3 (fp16)

A high-performance Persian sentence embedding model based on Alibaba-NLP/gte-multilingual-base, released in fp16 for efficient inference.


Model Overview

This is the fp16 (half-precision) version of maux-gte-persian-v3, a Sentence Transformers model fine-tuned from Alibaba-NLP/gte-multilingual-base for robust Persian sentence and paragraph embeddings.
The fp16 format enables faster and more memory-efficient inference, especially on modern GPUs.
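
As a rough illustration of the memory claim, the sketch below estimates the size of the weights alone in fp32 and fp16. It assumes the approximate parameter count of 305M shown on the model page and ignores activations, tokenizer files, and framework overhead; `weight_size_mib` is a hypothetical helper written for this example, not part of any library.

```python
import numpy as np

# Approximate parameter count as reported on the model page (assumption: ~305M).
PARAM_COUNT = 305_000_000


def weight_size_mib(num_params: int, dtype) -> float:
    """Size of the raw weights in MiB for a given NumPy dtype."""
    return num_params * np.dtype(dtype).itemsize / 2**20


fp32_mib = weight_size_mib(PARAM_COUNT, np.float32)  # 4 bytes per parameter
fp16_mib = weight_size_mib(PARAM_COUNT, np.float16)  # 2 bytes per parameter

print(f"fp32 weights: {fp32_mib:.0f} MiB")
print(f"fp16 weights: {fp16_mib:.0f} MiB")
```

Halving the bytes per parameter halves the weight footprint; actual speedups depend on the hardware and on whether the model is run in half precision end to end.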

Key Features:

  • Base Model: Alibaba-NLP/gte-multilingual-base
  • Fine-tuned on: mshojaei77/Persian_sft (80,000 Persian sentence pairs)
  • Output Dimension: 768
  • Max Sequence Length: 8192 tokens
  • Similarity Function: Cosine Similarity
  • Loss Function: MultipleNegativesRankingLoss
  • Format: fp16 (model.safetensors)

Performance

  • Strong reported performance on Persian semantic similarity, search, and clustering tasks.
  • Matches or outperforms jinaai-v3 on most Persian benchmarks (see comparison charts).
  • Efficient for large-scale inference due to fp16 weights.

Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NewModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, ...})
  (2): Normalize()
)
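
The last two modules above can be read as "take the first ([CLS]) token vector, then L2-normalize it". The following is a minimal NumPy sketch of that post-processing on random stand-in data. It is not the library's implementation, and `cls_pool_and_normalize` is a hypothetical name used only here.

```python
import numpy as np


def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, dim) token vectors to (batch, dim) unit vectors.

    Mirrors Pooling(pooling_mode_cls_token=True) followed by Normalize().
    """
    cls_vectors = token_embeddings[:, 0, :]  # first token of each sequence
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)  # guard against zero norm


# Random stand-in for transformer outputs: 3 sentences, 5 tokens, 768 dims.
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(3, 5, 768))

sentence_vectors = cls_pool_and_normalize(hidden_states)
print(sentence_vectors.shape)  # (3, 768)
```

Because every output vector has unit length, the dot product of two embeddings equals their cosine similarity.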

Usage

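from sentence_transformers import SentenceTransformer

# trust_remote_code=True is needed because the repository ships custom model code.
model = SentenceTransformer("xmanii/maux-gte-persian-v3-fp16", trust_remote_code=True)
sentences = [
    'برج میلاد در تهران هست',  # "Milad Tower is in Tehran"
    'یکی از برج های مسکونی تهران برج تهران است',  # "One of Tehran's residential towers is Tehran Tower"
    'تهران برج های زیادی دارد'  # "Tehran has many towers"
]
embeddings = model.encode(sentences)  # NumPy array by default
print(embeddings.shape)  # (3, 768)

# Compute pairwise cosine similarity
similarities = model.similarity(embeddings, embeddings)  # torch.Tensor
print(similarities.shape)  # torch.Size([3, 3])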
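
For search-style use, the same cosine similarity can rank a corpus against a query. The sketch below is illustrative only: it uses small hand-made vectors in place of real `model.encode` outputs so it runs without downloading the model, and `rank_by_cosine` is a hypothetical helper, not a Sentence Transformers function.

```python
import numpy as np


def rank_by_cosine(query_vec: np.ndarray, corpus_vecs: np.ndarray):
    """Return (corpus_index, cosine_score) pairs, best match first."""
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q                 # cosine similarity of each row with the query
    order = np.argsort(-scores)    # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]


# Toy 3-dimensional stand-ins for embeddings (real ones would have 768 dims).
corpus = np.array([
    [1.0, 0.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
])
query = np.array([0.9, 0.1, 0.0])

ranking = rank_by_cosine(query, corpus)
for index, score in ranking:
    print(index, round(score, 3))
```

With real embeddings, `corpus` would be the result of encoding the documents and `query` the result of encoding a single query string.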

Training Details

  • Dataset: mshojaei77/Persian_sft
  • Loss: MultipleNegativesRankingLoss (scale=20.0, similarity_fct="cos_sim")
  • Batch size: 64
  • Precision: bf16 during training, fp16 for this release
  • Frameworks: Python 3.10, Sentence Transformers 4.1.0, Transformers 4.51.3, PyTorch 2.7.0+cu126
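
To make the loss configuration above concrete, here is a simplified NumPy sketch of the idea behind MultipleNegativesRankingLoss with `scale=20.0` and cosine similarity: each anchor is scored against every positive in the batch, and the other rows' positives act as in-batch negatives. This is a sketch under those assumptions, not the Sentence Transformers implementation, and `mnr_loss` is a hypothetical name.

```python
import numpy as np


def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Cross-entropy over scaled cosine similarities with in-batch negatives.

    Row i of `positives` is the true match for row i of `anchors`;
    every other row serves as a negative for that anchor.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))            # target is the diagonal


# Toy batch: correctly paired vectors versus deliberately mismatched pairs.
vectors = np.eye(4)
aligned_loss = mnr_loss(vectors, vectors)
shuffled_loss = mnr_loss(vectors, np.roll(vectors, 1, axis=0))

print(aligned_loss < shuffled_loss)  # True
```

The scale factor sharpens the softmax: with unit-length vectors, similarities lie in [-1, 1], so multiplying by 20 makes correct pairs dominate the distribution once they are well separated.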

Files

  • model.safetensors (fp16 weights)
  • All necessary config and tokenizer files
  • Custom code: modeling.py, configuration.py (required for loading)

Citation

If you use this model, please cite:

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Acknowledgements


License

This model is distributed under the same license as the base model and dataset.


For questions or feedback, please open an issue or discussion on the Hugging Face model page.

Model size: 305M params (Safetensors, tensor type F16)