SentenceTransformer
This is a sentence-transformers model fine-tuned from google/embeddinggemma-300m. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base Model: google/embeddinggemma-300m
- Maximum Sequence Length: 2048 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation (https://sbert.net)
- Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
- Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 2048, 'do_lower_case': False, 'architecture': 'Gemma3TextModel'})
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
(3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
(4): Normalize()
)
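Once the model is loaded (see Usage below), these settings can be checked directly on the model object. A quick sanity-check sketch:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Saidakmal/uz_embeddinggemma-300m")
print(model.max_seq_length)                      # 2048
print(model.get_sentence_embedding_dimension())  # 768
print(model)                                     # prints the module stack shown above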
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Saidakmal/uz_embeddinggemma-300m")
# Run inference
queries = [
"Lady antebellum ismi qayerdan kelib chiqqan ?",
]
documents = [
'Lady Antebellum 2010 yil 9 avgust kuni BBC Radio 2 Drivetime Show-da guruh uy egasiga Liza Tarbuqqa Antebellum nomi guruh "avval" uylarini suratga olganida kelib chiqqanligini tushuntirdi. Avval urushdan oldingi arxitektura usuli Amerika Janubiyidagi katta plantatsiya uylarini tasvirlaydi. Latindagi bellum so\'zi " urush" degani; "avval" demak " urushdan oldin " degani.',
'Necrotising fasciitis B.C. 5-asrda Hippokrates necrotising yumshoq to\'qima infektsiyasini Streptococcal infeksiyaning komplikasiyasi bo\'lgan kasallik deb tasvirlagan. Bu kasallik "tanamizning barcha qismida eritsipellalarga ega bo\'lgan, sababi esa oddiy hodisa edi. Suyaklar, go\'sht va suyaklar (qut, tendon yoki nerv) tanadan tushib, ko\'plab o\'limlar yuz berdi". Necrotising yumshoq to\'qima infektsiyasini birinchi marta ingliz shifokor Leonard Gillespie va ingliz shifokor Gilbert Blaine va Tomas Trotter tomonidan 18 asrda tavsiflab berilgan edi. O\'sha paytda necrotising yumshoq to\'qima infektsiyasi pedaenik (g\'irni-qizish yoki g\'angrenni bosish) deb nomlangan.',
"Sutro yo'li Quyosh Orion qo'li ichki chekkasi yaqinida, Mahalliy Bubble mahalliy Fluff ichida va Gould Beltda, Galaktik markazidan 26,4 ± 1,0 kly (8,09 ± 0,31 kpc) masofada joylashgan. Quyosh hozirda Galaktik diskning markaziy toshidan 530 parsek (1698 ly) uzoqlikda joylashgan.",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[ 0.6160, 0.1431, -0.0269]])
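The similarity scores can also be turned directly into a ranking. A minimal sketch, continuing from the variables above and using sentence_transformers.util.semantic_search:
from sentence_transformers import util

# Rank the three documents for the single query by cosine similarity
hits = util.semantic_search(query_embeddings, document_embeddings, top_k=3)
for hit in hits[0]:
    print(hit["corpus_id"], round(hit["score"], 4))
# 0 0.616   <- the Lady Antebellum passage ranks first
# 1 0.1431
# 2 -0.0269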
Evaluation
Metrics
Information Retrieval
- Evaluated with InformationRetrievalEvaluator (a minimal reproduction sketch follows the metrics table below)
Metric | Value |
---|---|
cosine_accuracy@1 | 0.598 |
cosine_accuracy@3 | 0.762 |
cosine_accuracy@5 | 0.811 |
cosine_accuracy@10 | 0.865 |
cosine_precision@1 | 0.598 |
cosine_precision@3 | 0.254 |
cosine_precision@5 | 0.1622 |
cosine_precision@10 | 0.0865 |
cosine_recall@1 | 0.598 |
cosine_recall@3 | 0.762 |
cosine_recall@5 | 0.811 |
cosine_recall@10 | 0.865 |
cosine_ndcg@10 | 0.7329 |
cosine_mrr@10 | 0.6904 |
cosine_map@100 | 0.6946 |
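The metrics above were produced with InformationRetrievalEvaluator. A minimal sketch of how such an evaluation is set up; the IDs, texts, and evaluator name below are hypothetical placeholders, not the actual evaluation split:
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("Saidakmal/uz_embeddinggemma-300m")

# Hypothetical toy split: query IDs -> texts, document IDs -> texts,
# and each query ID -> the set of relevant document IDs.
queries = {"q1": "Lady antebellum ismi qayerdan kelib chiqqan ?"}
corpus = {
    "d1": "Lady Antebellum ... (relevant passage)",
    "d2": "Necrotising fasciitis ... (unrelated passage)",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="uz-ir-dev",  # placeholder name, used only for logging
)
metrics = evaluator(model)
print(metrics)  # reports cosine_accuracy@k, cosine_precision@k, cosine_recall@k, cosine_ndcg@10, ...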
Training Details
Training Dataset
Unnamed Dataset
- Size: 18,000 training samples
- Columns: sentence_0, sentence_1, sentence_2, and sentence_3
- Approximate statistics based on the first 1000 samples:

Statistic | sentence_0 | sentence_1 | sentence_2 | sentence_3 |
---|---|---|---|---|
type | string | string | string | string |
min | 7 tokens | 27 tokens | 11 tokens | 26 tokens |
mean | 18.8 tokens | 162.4 tokens | 159.54 tokens | 158.32 tokens |
max | 55 tokens | 985 tokens | 945 tokens | 754 tokens |
- Samples:
Sample 1:
- sentence_0: koʻchada oʻtirganlarni kuylagan
- sentence_1: (Sittin' On) The Dock of the Bay "Sittin' On) The Dock of the Bay" - soul qo'shiqchisi Otis Redding va gitarochi Steve Cropper tomonidan birgalikda yozilgan qo'shiq. Redding tomonidan 1967 yilda ikki marta, shu jumladan u samolyot halok bo'lishidan bir necha kun oldin, yozib olingan. Qo'shiq 1968 yilda Stax Recordsning Volt kompaniyasida chiqarilgan, [1] AQShda reytinglar safida birinchi o'limdan keyingi singl bo'lib chiqdi.
- sentence_2: Sidney Harbour Bridge Ko'prikning umumiy moliyaviy qiymati 6,25 million funt funtligidan iborat edi, bu 1988 yilgacha to'liq to'lanmagan. [1]
- sentence_3: Saudiya Arabistonining siyosati Saudiya Arabistonining siyosati ayrim islom yo'nalishlari bo'lgan mutlaq monarxiya kontestida amalga oshiriladi, unda shoh davlat va hukumat rahbari bo'lib xizmat qiladi. Qarorlar katta darajada shoh oilasi va diniy muassasalarning katta ruhoniylari o'rtasida maslahatlashuv asosida qabul qilinadi. Qur'on mamlakat konstitutsiyasi deb e'lon qilinadi, u islom qonuni asosida boshqaradi (Shari'a). Yangi shoh va yangi nasl prinsi tayinlash uchun sodiqlik kengashi mas'ul. To'liq yoshdagi barcha fuqarolar majlis deb nomlangan an'anaviy qabilaviy majlis orqali to'g'ridan-to'g'ri shohga tashrif buyurish, uchrashish va iltimos qilish huquqiga ega.[1]

Sample 2:
- sentence_0: Hindistondagi yer buzilishining sabablarini tushuntiring
- sentence_1: Yerni buzish O'tkir cho'chqachilik - chorva mollarini ko'tarib oluvchi quvvatdan ortiq darajada chorvachilik bilan tabiiy o'tlar o'tishi; natijada o'simlik qoplamasining pasayishi shamol va suv eroziyasining asosiy sababidir. Bu Afg'onistonda muhim omil hisoblanadi. 1980-1990 yillarda aholi bosimining oshishi, sakkiz mamlakatdan oltida har bir kishiga nisbatan qishloq xo'jaligi yerlarining allaqachon kichik maydonlarida pasayishlarga olib keldi (14% Hindiston uchun va 21% Pokiston uchun).
- sentence_2: O'q-po'drat texnologiyasining tarixi O'n to'rtinchi asr o'rtalarida Hindistonga kelgan deb hisoblanadi. Ammo uni Xitoyni ham, Hindistonning ayrim chegara hududlarini ham bosib olgan mo'g'ollar ancha oldin, ehtimol XIII asr o'rtalarida ham joriy etgan bo'lishi mumkin. Katta bir mo'g'ol imperiyasining birlashishi Xitoy texnologiyasining Hindistonning mo'g'ollar tomonidan bosib olingan qismlariga erkin o'tkazilishiga olib keldi. Shunga qaramay, mo'g'ollar Hindistonga bostirib kirganlarida Xitoyga o'q-po'drat qurollaridan foydalangan deb hisoblanadi. Tarix-i Firishta (16061607) da mo'g'ollar hukmron Huligu elchiga 1258 yilda Dehliga kelganida ajoyib pyrotexnika taqdim etilganligi yozilgan. Birinchi o'q-po'drat texnologiyasini mo'g'ollar tomonidan Hindistonga o'q-po'drat qo'yishdi.
- sentence_3: 1765 yil Stamp Act (qisqa nom Amerika koloniyalarida majburiyatlar to'g'risidagi qonun 1765; 5 George III, c. 12) - Buyuk Britaniya parlamenti qonunidir, u Britaniya Amerika koloniyalariga to'g'ridan-to'g'ri soliq solgan va koloniyalardagi ko'plab bosma materiallar Londonda ishlab chiqarilgan bosma qog'ozda ishlab chiqarilishi kerak edi, bu bosma qog'ozda daromad sumkasi bor edi.[1][2] Bosma materiallar yuridik hujjatlar, jurnallar, o'yin kartalari, gazetalar va ko'plab boshqa qog'ozlarni o'z ichiga olgan. Oldingi soliqlar kabi, bosma soliq to'lovning maqsadi kolonial qog'oz pulda emas, balki amalda Britaniya valyutasida to'lanishi kerak edi.

Sample 3:
- sentence_0: qonun va tartib boʻyicha oʻldirilgan Ada kim edi?
- sentence_1: Aleksandra Borgiya Borgiya Law & Order franchisasi tarixidagi eng qisqa ishtirok etgan yordamchi tuman prokurori edi, u faqat 33 ta epizodda ko'rinadi. Oila qotilligini tekshirishda prokurorlik idorasi er Frank Andreasga e'tibor qaratadi, u qotillarga uyga bostirib kiruvchi talon-torojlarni sodir etish uchun ishlatiladigan soxta DEA belgilari bilan ta'minlaydi. Borgiya Andreasga uning sheriklarini tashlashga bosim o'tkazadi va keyinchalik o'z xonadoniga o'g'irlandi. Uning jasadi keyinchalik tashlab qo'yilgan mashinaning bagazida topilgan, bog'langan, shafqatsiz urilgan va o'zini bo'g'ib qo'yganidan so'ng asfiksiyadan o'lgan. Ajablanayotgan McCoy o'zining qotillarini qamoq qilish uchun soxta ayblovni tashkil etadi, qonuniy axloqiy ahlakni o'zgartiradi.
- sentence_2: Harry Potter (qarakteri) Harry Potter va o'lim marosimlarida Harry, Ron va Hermione Hogwartsdan chiqib, Dumbledore vazifasini bajaradilar: Voldemortning qolgan to'rtta Horcruxini qidirish va yo'q qilish, keyin Qorong'i Lordni topish va o'ldirish. Uch kishi Voldemortning yangi tashkil etilgan totalitar politsiya davlatiga qarshi o'zlarini qo'yishadi, bu harakat Xarrining jasorati va axloqiy xarakterini sinaydi. Voldemortning sehr vazirligini egallashi propaganda va qo'rquv bilan rag'batlantirilgan Muggle-bo'ralarga qarshi diskriminatorlik va genotsid siyosatiga olib keladi. J. K. Rowlingning aytishicha, Harri Cruciatus va Imperius Curse, azob-uqubat va ongni nazorat qilish uchun kechirilmas la'natlardan foydalanadi.
- sentence_3: 2018 yilgi kollej futboli playofflari milliy chempionati 2018 yilgi kollej futboli playofflari milliy chempionati - bu 2017 yilgi mavsum uchun NCAA I futbol Bowl bo'limidagi milliy chempionni belgilaydigan kollej futboli bo'l o'yinidir. Bu o'yin 2018 yil 8 yanvar kuni Georgia shtatining Atlanta shahridagi Mercedes-Benz stadionida o'ynatiladi. Uch yillik aylanish doirasida o'yin 2018 yil 1 yanvar kuni o'ynaydigan ikki yarim final bo'l o'yinlarining g'oliblari o'rtasida o'ynatiladi: Rose Bowl o'yin va Sugar Bowl. Ushbu ikki o'yinda ishtirokchilar 2017 yilgi muntazam mavsum yakunidan so'ng aniqlanadi.
- Loss: MultipleNegativesRankingLoss with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim", "gather_across_devices": false }
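With four columns, MultipleNegativesRankingLoss treats sentence_0 as the anchor (query), sentence_1 as the positive passage, and sentence_2/sentence_3 as additional hard negatives, with the other in-batch examples serving as further negatives. A minimal construction sketch matching the parameters above (scale=20.0 and cosine similarity are also the library defaults):
from sentence_transformers import SentenceTransformer, util
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Start from the base checkpoint and build the loss with the listed parameters
model = SentenceTransformer("google/embeddinggemma-300m")
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)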
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- fp16: True
- multi_dataset_batch_sampler: round_robin
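A minimal sketch of how these non-default values are passed to the trainer via SentenceTransformerTrainingArguments; output_dir is a placeholder path, the tiny train/eval datasets below stand in for the 18,000-sample dataset described above, and fp16 requires a CUDA GPU:
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("google/embeddinggemma-300m")
loss = MultipleNegativesRankingLoss(model, scale=20.0)

# Placeholder dataset with the same column layout (sentence_0 ... sentence_3)
train_dataset = Dataset.from_dict({
    "sentence_0": ["Lady antebellum ismi qayerdan kelib chiqqan ?"],
    "sentence_1": ["(relevant passage)"],
    "sentence_2": ["(hard negative 1)"],
    "sentence_3": ["(hard negative 2)"],
})

args = SentenceTransformerTrainingArguments(
    output_dir="outputs/uz_embeddinggemma-300m",  # placeholder path
    eval_strategy="steps",
    fp16=True,
    multi_dataset_batch_sampler="round_robin",
    per_device_train_batch_size=8,  # repeated from the full list below
    num_train_epochs=3,             # repeated from the full list below
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # placeholder; a held-out split or an evaluator would be used in practice
    loss=loss,
)
trainer.train()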
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
Epoch | Step | Training Loss | cosine_ndcg@10 |
---|---|---|---|
0.2222 | 500 | 0.4649 | 0.6259 |
0.4444 | 1000 | 0.5086 | 0.5681 |
0.6667 | 1500 | 0.5243 | 0.6237 |
0.8889 | 2000 | 0.5062 | 0.6097 |
1.0 | 2250 | - | 0.5946 |
1.1111 | 2500 | 0.3389 | 0.6567 |
1.3333 | 3000 | 0.1844 | 0.6175 |
1.5556 | 3500 | 0.1605 | 0.6577 |
1.7778 | 4000 | 0.144 | 0.6864 |
2.0 | 4500 | 0.1451 | 0.6871 |
2.2222 | 5000 | 0.0263 | 0.7154 |
2.4444 | 5500 | 0.0312 | 0.7324 |
2.6667 | 6000 | 0.0279 | 0.7329 |
Framework Versions
- Python: 3.13.5
- Sentence Transformers: 5.1.0
- Transformers: 4.56.1
- PyTorch: 2.8.0+cu128
- Accelerate: 1.9.0
- Datasets: 2.19.1
- Tokenizers: 0.22.0
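To reproduce this environment, the listed versions can be pinned at install time (one possible invocation; the exact PyTorch wheel/index for CUDA 12.8 may vary by system):
pip install sentence-transformers==5.1.0 transformers==4.56.1 accelerate==1.9.0 datasets==2.19.1 tokenizers==0.22.0
# PyTorch 2.8.0+cu128 typically comes from the CUDA 12.8 wheel index, e.g.:
pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu128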
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}