Upload README.md with huggingface_hub
README.md
CHANGED
|
@@ -63157,7 +63157,7 @@ library_name: sentence-transformers
|
|
| 63157 |
|
| 63158 |
# BGE-M3 fine-tuned with Matryoshka + MNRLoss
|
| 63159 |
|
| 63160 |
-
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) on the json dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
|
| 63161 |
|
| 63162 |
## Model Details
|
| 63163 |
|
|
@@ -63203,7 +63203,7 @@ Then you can load this model and run inference.
|
|
| 63203 |
from sentence_transformers import SentenceTransformer
|
| 63204 |
|
| 63205 |
# Download from the 🤗 Hub
|
| 63206 |
-
model = SentenceTransformer("Yesimm/InfectaVec-
|
| 63207 |
# Run inference
|
| 63208 |
sentences = [
|
| 63209 |
'최근 몇 년간 SFTS의 발생 추세는 어떤가요?',
|
|
@@ -63422,6 +63422,34 @@ You can finetune this model on your own dataset.
|
|
| 63422 |
- Datasets: 4.0.0
|
| 63423 |
- Tokenizers: 0.21.1
|
| 63424 |
|
| 63425 |
## Citation
|
| 63426 |
|
| 63427 |
### BibTeX
|
|
|
|
| 63157 |
|
| 63158 |
# BGE-M3 fine-tuned with Matryoshka + MNRLoss
|
| 63159 |
|
| 63160 |
+
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) on the json dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. The main difference from InfectaVec-v1 is that InfectaVec-v2 is trained with paraphrased and bitext-mined queries (En to Kr, Kr to En).
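
The title mentions Matryoshka training. Assuming this refers to Matryoshka representation learning, where a prefix of the full 1024-dimensional embedding can be used as a smaller embedding, the sketch below shows the usual post-processing step: keep the first `dim` components and re-normalize to unit length. The helper name `truncate_and_normalize` and the toy vector are hypothetical, and this excerpt does not state which truncated dimensions the model was trained for.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components of `vec` and rescale to unit L2 norm.

    Hypothetical helper for illustration; not part of any library API.
    """
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale for an all-zero prefix
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embedding" standing in for a 1024-dimensional one.
full = [3.0, 4.0, 12.0]
short = truncate_and_normalize(full, 2)
```

Truncated vectors should only be compared with other vectors truncated to the same length.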
|
| 63161 |
|
| 63162 |
## Model Details
|
| 63163 |
|
|
|
|
| 63203 |
from sentence_transformers import SentenceTransformer
|
| 63204 |
|
| 63205 |
# Download from the 🤗 Hub
|
| 63206 |
+
model = SentenceTransformer("Yesimm/InfectaVec-v1")
|
| 63207 |
# Run inference
|
| 63208 |
sentences = [
|
| 63209 |
'최근 몇 년간 SFTS의 발생 추세는 어떤가요?',
|
|
|
|
| 63422 |
- Datasets: 4.0.0
|
| 63423 |
- Tokenizers: 0.21.1
|
| 63424 |
|
| 63425 |
+
|
| 63426 |
+
### Evaluation Results
|
| 63427 |
+
## Evaluation Results on Infectious Diseases Test Dataset
|
| 63428 |
+
|
| 63429 |
+
| Model | Epoch | Accuracy@1 | Recall@1 | Precision@10 | NDCG@10 | MRR@10 | MAP@100 |
|-------|-------|------------|----------|--------------|---------|--------|---------|
| **BGE-M3** | - | 44.49 | 44.49 | 7.43 | 58.97 | 54.12 | 54.83 |
| **InfectaVec v1** | 2 | 62.67 | 62.67 | 9.37 | 78.21 | 73.23 | 73.58 |
| | 3 | 62.30 | 62.30 | 9.42 | 78.58 | 73.52 | 73.85 |
| | 4 | 62.83 | 62.83 | 9.46 | 78.92 | 73.87 | 74.18 |
| **InfectaVec v2** | 2 | 59.49 | 59.49 | 9.08 | 75.35 | 70.37 | 70.81 |
| | 3 | 61.08 | 61.08 | 9.16 | 76.43 | 71.55 | 71.98 |
| | 4 | 61.29 | 61.29 | 9.16 | 76.49 | 71.63 | 72.07 |
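
To make the column names in the table above concrete, here is a minimal sketch of how Accuracy@1, Precision@10, MRR@10, and NDCG@10 can be computed from ranked retrieval results. It assumes each query has exactly one relevant document, which is consistent with the identical Accuracy@1 and Recall@1 columns but is not stated in this excerpt. The function names and the toy data are hypothetical, and this is not the evaluation code used to produce the table.

```python
import math


def rank_of_relevant(ranked_ids, relevant_id, k):
    """Return the 1-based rank of `relevant_id` within the top `k`, or None."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id == relevant_id:
            return rank
    return None


def evaluate(rankings, gold, k=10):
    """Average retrieval metrics over queries.

    Assumes one relevant document per query (an assumption, see lead-in),
    so the ideal DCG is 1 and NDCG reduces to 1 / log2(rank + 1).
    """
    n = len(rankings)
    acc1 = prec = mrr = ndcg = 0.0
    for query_id, ranked_ids in rankings.items():
        rank = rank_of_relevant(ranked_ids, gold[query_id], k)
        if rank is None:
            continue  # a miss contributes 0 to every metric
        acc1 += 1.0 if rank == 1 else 0.0
        prec += 1.0 / k  # one relevant hit among k retrieved slots
        mrr += 1.0 / rank
        ndcg += 1.0 / math.log2(rank + 1)
    return {
        "accuracy@1": acc1 / n,
        f"precision@{k}": prec / n,
        f"mrr@{k}": mrr / n,
        f"ndcg@{k}": ndcg / n,
    }


# Toy data: q1 is a hit at rank 1, q2 at rank 2, q3 is a miss.
rankings = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d9", "d5", "d7"],
    "q3": ["d4", "d6", "d8"],
}
gold = {"q1": "d1", "q2": "d5", "q3": "d0"}
scores = evaluate(rankings, gold, k=10)
```

Under the single-relevant-document assumption, Precision@10 cannot exceed 0.10 per query, which matches the scale of the Precision@10 column (values below 10 when expressed in percent).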
|
| 63438 |
+
|
| 63439 |
+
|
| 63440 |
+
## Evaluation Results on MTEB Medical Benchmarks for Retrieval, Clustering and Semantic Text Similarity Tasks
|
| 63441 |
+
| Model | PublicHealthQA (Kr) | PublicHealthQA (En) | MedrxivClusteringS2S.v2 (En) | BIOSSES (En) |
|-------|---------------------|---------------------|------------------------------|--------------|
| **BGE-M3** | 80.41 | 83.81 | 30.63 | 83.38 |
| **Multilingual e5-large** | 85.14 | 84.57 | 39.14 | 87.45 |
| **InfectaVec-v1** | 79.70 | 82.57 | 34.62 | 79.37 |
| **Qwen-3 Embedding-0.6B** | 81.10 | 83.84 | 40.38 | 84.73 |
| **InfectaVec-v2** | 82.36 | 84.85 | 34.23 | 76.51 |
|
| 63448 |
+
|
| 63449 |
+
## Citation
|
| 63450 |
+
|
| 63451 |
+
### BibTeX
|
| 63452 |
+
|
| 63453 |
## Citation
|
| 63454 |
|
| 63455 |
### BibTeX
|