w601sxs committed df177cc (parent: fbe0925)

updated readme
.ipynb_checkpoints/README-checkpoint.md ADDED
The diff for this file is too large to render. See raw diff
 
.ipynb_checkpoints/config-checkpoint.json ADDED
@@ -0,0 +1,26 @@
+ {
+ "_name_or_path": "./merged_model",
+ "architectures": [
+ "BertModel"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "classifier_dropout": null,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 1024,
+ "initializer_range": 0.02,
+ "intermediate_size": 4096,
+ "layer_norm_eps": 1e-12,
+ "max_position_embeddings": 512,
+ "model_type": "bert",
+ "num_attention_heads": 16,
+ "num_hidden_layers": 24,
+ "pad_token_id": 0,
+ "position_embedding_type": "absolute",
+ "torch_dtype": "float32",
+ "transformers_version": "4.40.2",
+ "type_vocab_size": 2,
+ "use_cache": true,
+ "vocab_size": 30522
+ }
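The configuration above describes a BERT-large-shaped encoder (24 layers, hidden size 1024, 16 attention heads). As a quick sanity check, a rough parameter count derived from these values lands near the 335M figure quoted later in this README. The sketch below uses only numbers copied from the config and the standard BERT parameter breakdown; it deliberately ignores small LayerNorm and pooler terms:

```python
# Rough parameter count for the encoder described by config-checkpoint.json.
cfg = {
    "hidden_size": 1024,
    "num_hidden_layers": 24,
    "num_attention_heads": 16,
    "intermediate_size": 4096,
    "vocab_size": 30522,
    "max_position_embeddings": 512,
    "type_vocab_size": 2,
}

h, inter = cfg["hidden_size"], cfg["intermediate_size"]

# Token, position, and segment embedding tables.
embeddings = (cfg["vocab_size"] + cfg["max_position_embeddings"]
              + cfg["type_vocab_size"]) * h

# Per layer: Q/K/V/output projections plus the two feed-forward matrices,
# each with a bias vector.
per_layer = 4 * (h * h + h) + (h * inter + inter) + (inter * h + h)

total = embeddings + cfg["num_hidden_layers"] * per_layer
print(f"~{total / 1e6:.0f}M parameters")

# The head dimension must divide the hidden size evenly.
assert h % cfg["num_attention_heads"] == 0
```

The total comes out at roughly 334M, consistent with BERT-large geometry and the 335M figure cited below (the exact count adds LayerNorm and pooler weights).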
README.md CHANGED
@@ -2897,6 +2897,16 @@ model-index:
     value: 78.62958187511512
 ---
 
 To use this model:
 ```
 from transformers import AutoTokenizer, AutoModel
@@ -2904,4 +2914,29 @@ from transformers import AutoTokenizer, AutoModel
 tokenizer = AutoTokenizer.from_pretrained("w601sxs/b1ade-embed")
 ```
 
- b1ade-embed is part of a collection of small models for RAG.
+ `b1ade-embed` is a small but efficient embedding model for RAG. On the legacy MTEB leaderboard (up to 2024), b1ade-embed ranked #1 in the STS category and placed competitively in other important task categories such as ranking, retrieval, and classification. The model was trained using a combination of:
+ 
+ 1. Model merging:
+    - bert-large-uncased
+    - WhereIsAI/UAE-Large-V1
+    - BAAI/bge-large-en-v1.5
+    - mixedbread-ai/mxbai-embed-large-v1
+    - avsolatorio/GIST-large-Embedding-v0
+ 2. Knowledge distillation from larger models
+ 
 To use this model:
 ```
 from transformers import AutoTokenizer, AutoModel
 
 tokenizer = AutoTokenizer.from_pretrained("w601sxs/b1ade-embed")
 ```
 
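The snippet above stops at the tokenizer. A fuller usage sketch follows, assuming mean pooling over non-padding tokens, which is a common pooling choice for BERT-style embedders; check the model card for the pooling the authors intended. The `embed` helper is illustrative, not an official API:

```python
import torch

def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    return (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

def embed(texts, model_name="w601sxs/b1ade-embed"):
    # Imported here so mean_pool stays usable without transformers installed.
    from transformers import AutoTokenizer, AutoModel
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    return mean_pool(out.last_hidden_state, batch["attention_mask"])
```

Calling `embed(["query text", "passage text"])` would return one pooled vector per input string.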
+ b1ade-embed is part of a collection of small models for RAG. Stay tuned for more updates.
+ 
+ ## Use in research
+ 
+ Our embedding model "b1ade-embed" is a 335M parameter model that demonstrates strong performance across the board. Recent research has used the model in clinical and labor market domains, relying on its #1 ranking in Semantic Textual Similarity (STS) among models under 500M parameters on the MTEB leaderboard.
+ 
+ We've been working on b1ade-embed to optimize the balance between latency and performance. This balance is crucial in real-world applications, especially in verticalized domains, where rapid processing of vast amounts of data can significantly impact decision-making. While achieving high accuracy is important, the ability to deliver results quickly is equally vital. Larger embedding outputs also incur higher storage costs in vector indexes, so striking a balance between task performance and latency matters.
+ 
+ The medRxiv paper, "A Scalable Framework for Benchmarking Embedding Models for Clinical Tasks," provides a comprehensive evaluation of embedding models in healthcare contexts. It tested 30 models across various clinical tasks (2.1M comparisons), including analysis of patient notes, synthetic EHRs, and MIMIC-IV ICU data, as well as biomedical tasks involving PubMed abstracts and research papers. The study highlights b1ade-embed's versatility across these domains:
+ 
+ "Other models exhibiting strong performance in both clinical and PubMed domains include 'b1ade-embed'." It also emphasizes the model's efficiency, noting that "Models like 'b1ade-embed' demonstrate high efficiency despite smaller size, making them ideal for tasks requiring rapid processing." The paper evaluated models on short tasks such as triage notes and chief complaints, where b1ade-embed achieved a high score of 27.4, competing closely with larger models.
+ 
+ In the labor market context, the CEUR-WS paper demonstrates b1ade-embed's effectiveness in taxonomy enrichment: "We evaluated the robustness of our system against a closed-world evaluation constructed using ESCO's hierarchy, achieving a 81% Positive Predictive Value (PPV) when combining all three models." This high accuracy shows b1ade-embed's capability to capture nuanced semantic relationships in labor market terminology. Of course, no single model wears the crown 👑 for every task; carefully evaluate task performance versus latency for your specific embedding task (STS, retrieval, clustering, etc.).
+ 
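For STS-style use, pairs of sentence embeddings are typically scored with cosine similarity. A dependency-free sketch (the input vectors here stand in for embeddings produced by the model):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Scores near 1.0 indicate near-identical meaning; for retrieval or clustering, the same function ranks candidate passages against a query vector.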
+ ## Cite
+ 
+ ```
+ @misc{b1ade_embed_2024,
+   author = { {Shreyas Subramanian} },
+   title = { {b1ade series of models} },
+   year = 2024,
+   url = { https://huggingface.co/w601sxs/b1ade-embed },
+   publisher = { Hugging Face }
+ }
+ ```