hpprc committed
Commit 29f63a3 · 1 Parent(s): f3b4ba3

Update README.md

Files changed (1)
  1. README.md +21 -20
README.md CHANGED
@@ -7,29 +7,30 @@ tags:
 - transformers
 datasets:
 - shunk031/jsnli
+ license: cc-by-sa-4.0
+ language:
+ - ja
 ---
 
- # {MODEL_NAME}
+ # sup-simcse-ja-large
 
- This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 1024 dimensional dense vector space and can be used for tasks like clustering or semantic search.
 
- <!--- Describe your model here -->
 
 ## Usage (Sentence-Transformers)
 
 Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
 
 ```
- pip install -U sentence-transformers
+ pip install -U fugashi[unidic-lite] sentence-transformers
 ```
 
 Then you can use the model like this:
 
 ```python
 from sentence_transformers import SentenceTransformer
- sentences = ["This is an example sentence", "Each sentence is converted"]
+ sentences = ["こんにちは、世界!", "文埋め込み最高!文埋め込み最高と叫びなさい", "極度乾燥しなさい"]
 
- model = SentenceTransformer('{MODEL_NAME}')
+ model = SentenceTransformer("sup-simcse-ja-large")
 embeddings = model.encode(sentences)
 print(embeddings)
 ```
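For reference, the updated install line adds `fugashi[unidic-lite]` because Japanese BERT tokenizers run MeCab through the `fugashi` binding and need a dictionary such as `unidic-lite`. Below is a minimal sketch (not part of the diff) of how the resulting embeddings might be compared, assuming the model loads under the name used in the README; loading from the Hugging Face Hub may require the namespaced repository ID.

```python
from sentence_transformers import SentenceTransformer, util

# Japanese example sentences from the updated README.
sentences = ["こんにちは、世界!", "文埋め込み最高!文埋め込み最高と叫びなさい", "極度乾燥しなさい"]

# Model name as written in the README; a Hub namespace prefix may be required.
model = SentenceTransformer("sup-simcse-ja-large")
embeddings = model.encode(sentences)

# Pairwise cosine similarities between the three sentence embeddings;
# the diagonal is 1.0 by construction.
print(util.cos_sim(embeddings, embeddings))
```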
 
@@ -52,8 +53,8 @@ def cls_pooling(model_output, attention_mask):
 sentences = ['This is an example sentence', 'Each sentence is converted']
 
 # Load model from HuggingFace Hub
- tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
- model = AutoModel.from_pretrained('{MODEL_NAME}')
+ tokenizer = AutoTokenizer.from_pretrained("sup-simcse-ja-large")
+ model = AutoModel.from_pretrained("sup-simcse-ja-large")
 
 # Tokenize sentences
 encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
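The hunk above shows only fragments of the README's Usage (HuggingFace Transformers) section; as the hunk header indicates, they sit inside the stock sentence-transformers README template built around `cls_pooling`. A sketch of that full section, reconstructed on the assumption that this README follows the template unchanged apart from the model name:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# CLS pooling: keep the hidden state of the first ([CLS]) token,
# matching pooling_mode_cls_token=True in the architecture below.
def cls_pooling(model_output, attention_mask):
    return model_output[0][:, 0]

sentences = ['This is an example sentence', 'Each sentence is converted']

# Load model from HuggingFace Hub (name as in the README;
# a namespace prefix may be required).
tokenizer = AutoTokenizer.from_pretrained("sup-simcse-ja-large")
model = AutoModel.from_pretrained("sup-simcse-ja-large")

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings without tracking gradients.
with torch.no_grad():
    model_output = model(**encoded_input)

# Pool token embeddings into one vector per sentence.
sentence_embeddings = cls_pooling(model_output, encoded_input['attention_mask'])

print("Sentence embeddings:")
print(sentence_embeddings)
```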
 
@@ -69,24 +70,24 @@ print("Sentence embeddings:")
 print(sentence_embeddings)
 ```
 
-
-
- ## Evaluation Results
-
- <!--- Describe how your model was evaluated -->
-
- For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
-
-
-
 ## Full Model Architecture
 ```
 SentenceTransformer(
   (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
-  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
+  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
 )
 ```
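Note that this hunk changes the reported `word_embedding_dimension` from 1024 to 768, even though BERT-large encoders typically emit 1024-dimensional vectors, so the committed value is best confirmed at runtime. A minimal check, assuming the model loads as in the snippets above:

```python
from sentence_transformers import SentenceTransformer

# Name as in the README; a Hub namespace prefix may be required.
model = SentenceTransformer("sup-simcse-ja-large")

# Dimensionality of the sentence embeddings produced by the
# pooling layer shown in the architecture dump above.
print(model.get_sentence_embedding_dimension())
```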
 
 ## Citing & Authors
 
- <!--- Describe where people can find more information -->
+ ```
+ @misc{
+   hayato-tsukagoshi-2023-simple-simcse-ja,
+   author = {Hayato Tsukagoshi},
+   title = {Japanese Simple-SimCSE},
+   year = {2023},
+   publisher = {GitHub},
+   journal = {GitHub repository},
+   howpublished = {\url{https://github.com/hppRC/simple-simcse-ja}}
+ }
+ ```