svjack committed
Commit 5fe379f · verified · 1 Parent(s): 7334af3

Delete README.md

Files changed (1): README.md +0 −207
README.md DELETED
@@ -1,207 +0,0 @@
- ---
- library_name: setfit
- tags:
- - setfit
- - sentence-transformers
- - text-classification
- - generated_from_setfit_trainer
- metrics:
- - accuracy
- widget:
- - text: ' 从文中可以看出,关于无影人的描述并未出现,因此无法从文中抽取相关语段来回答这个问题。'
- - text: ' 从文中可以找到这样一个语段:"直到某一天,迷途的冒险家发现了他们。无影人惊奇地发现这名冒险家有一个亦步亦趋的追随者,寡言且忠实。" 这段描述了无影人是如何发现自己存在的,冒险家发现了这个没有影子的族群,而无影人是其中之一。因此,答案是:冒险家发现了无影人并告诉他们存在的存在。'
- - text: ' 从给出的叙述中,没有明确提到风雪是否有所消退的相关语段。'
- - text: ' 从文章中,我们可以找到关于年轻人(姆)失去的信息是,他们去了璃月港务工,每月寄回钱给家人,但是看到城市繁华和便利,可能永远不会回来生活了。因此,年轻人失去了与家人和轻策山庄的常居和生活方式。'
- - text: ' 从文章中,没有明确提到帕西法尔少爷的继承顺位或者他为什么要推进它。只有渔船老板提到人世的规矩,强调没有人做自己不喜欢的工作才能成事。'
- pipeline_tag: text-classification
- inference: true
- base_model: BAAI/bge-small-zh-v1.5
- model-index:
- - name: SetFit with BAAI/bge-small-zh-v1.5
-   results:
-   - task:
-       type: text-classification
-       name: Text Classification
-     dataset:
-       name: Unknown
-       type: unknown
-       split: test
-     metrics:
-     - type: accuracy
-       value: 1.0
-       name: Accuracy
- ---
-
- # SetFit with BAAI/bge-small-zh-v1.5
-
- This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-small-zh-v1.5](https://huggingface.co/BAAI/bge-small-zh-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
-
- The model has been trained using an efficient few-shot learning technique that involves:
-
- 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
- 2. Training a classification head with features from the fine-tuned Sentence Transformer.
-
- ## Model Details
-
- ### Model Description
- - **Model Type:** SetFit
- - **Sentence Transformer body:** [BAAI/bge-small-zh-v1.5](https://huggingface.co/BAAI/bge-small-zh-v1.5)
- - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- - **Maximum Sequence Length:** 512 tokens
- - **Number of Classes:** 2 classes
- <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
- <!-- - **Language:** Unknown -->
- <!-- - **License:** Unknown -->
-
- ### Model Sources
-
- - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
-
- ### Model Labels
- | Label | Examples |
- |:------|:---------|
- | 0 | <ul><li>' 从文本中,没有明确提到帕西法尔少爷推进继承顺位的原因。相关的语段仅描述了爱薇艾小姐引用了国王亚尔杰代伊的话,但是人们没有表达出什么感受,然后转向了下周的舞会。拉塔尔勋爵准备讲述关于高塔、巫师和玻璃球的传说,但是克里克先生打断了他。因此,没有包含"问题: 为什么帕西法尔少爷忍不住要将自己的继承顺位向前推进?"相关的语段。'</li><li>' 从文章中,我没有看到包含 "问题: 如果回到蒙德,那么什么路上的绊石只剩一个了?" 这句话的相关语段。因此,我无法提供答案。'</li><li>' 从给出的叙述中,没有明确提到失血和严寒对心智产生什么影响。'</li></ul> |
- | 1 | <ul><li>' 从文中可以找到以下相关语段:\n\n1. "无影人中的一人说,「梦?我们的人已经很久不会做梦。」"\n2. "「梦?你们的人已经很久不会做梦。」"\n3. "「梦魇比你所想象的更狡猾。当它们发现你的所为,就会蜂拥而起,将你拖入无光之境。在那里没有影子的边界,你无法离开。」"\n\n从这些语段中可以得出,无影人是一个没有梦的人,而梦是藏有灵魂秘密的东西。梦魇是一个更狡猾的存在,如果发现无影人的行动,会将其拖入无光之境,这个地方没有影子边界,无法离开。因此,无影人的能力是无梦和无影。'</li><li>' 年轻的魔法师失去了被大贤人收为徒弟的机会。'</li><li>' 由于先前发生的事件使我们失去了前一本日志,可能无法恢复它。因此,我们最终还是没有能够打开那扇大门。尽管壁画和英戈伯特老爷期待的古代武器等等,但最终都失败了。当我们回到雪山阳面的营地时,同侪中的某人还没有回来。虽然我们希望他们顺利下山,带着补给和救兵回来,但现在我们的补给已经不足了。虽然这可能很残酷,但密室的圆形大门前的塌方不仅夺走了尼克,还夺走了我们委托尼克保管的燃料和食物。虽然我们曾经说过先勘探遗迹结构的完整性。\n\n这几天的经历可能让我变得更加冷酷了。可能是绝望的环境对人造成了影响。但是厄伯哈特少爷却让人钦佩,即使遇到了这些事情,还能保持冷静思考的能力。即使只是私生子,他也是一个能配得上一族名字的人。\n\n我们将在风雪稍微消散之后,按照厄伯哈特少爷的建议,去西南侧的遗迹地窖去。根据他的解释,这里独特的严寒可能有着很久以前留下来的东西。虽然很不可思议,但这里特殊的严寒有着保存物资的能力。\n\n问题:为什么今天决定不去勘探西南面的遗迹地窖,而是去有着闭锁的密室?\n答案:由于我们失去了前一本日志,可能无法恢复它,导致我们无法确定西南面的遗迹地窖的情况。同时,密室前有着夺走了我们补给和燃料的塌方,使我们面临着不足够补给和燃料的问题。因此,我们决定去密室中寻找帮助。'</li></ul> |
-
- ## Evaluation
-
- ### Metrics
- | Label | Accuracy |
- |:--------|:---------|
- | **all** | 1.0 |
-
- ## Uses
-
- ### Direct Use for Inference
-
- First install the SetFit library:
-
- ```bash
- pip install setfit
- ```
-
- Then you can load this model and run inference.
-
- ```python
- from setfit import SetFitModel
-
- # Download from the 🤗 Hub
- model = SetFitModel.from_pretrained("setfit_model_id")
- # Run inference
- preds = model(" 从给出的叙述中,没有明确提到风雪是否有所消退的相关语段。")
- ```
-
- <!--
- ### Downstream Use
-
- *List how someone could finetune this model on their own dataset.*
- -->
-
- <!--
- ### Out-of-Scope Use
-
- *List how the model may foreseeably be misused and address what users ought not to do with the model.*
- -->
-
- <!--
- ## Bias, Risks and Limitations
-
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
- -->
-
- <!--
- ### Recommendations
-
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
- -->
-
- ## Training Details
-
- ### Training Set Metrics
- | Training set | Min | Median | Max |
- |:-------------|:----|:-------|:----|
- | Word count | 2 | 7.0 | 138 |
-
- | Label | Training Sample Count |
- |:------|:----------------------|
- | 0 | 65 |
- | 1 | 65 |
-
- ### Training Hyperparameters
- - batch_size: (16, 16)
- - num_epochs: (1, 1)
- - max_steps: -1
- - sampling_strategy: oversampling
- - body_learning_rate: (2e-05, 1e-05)
- - head_learning_rate: 0.01
- - loss: CosineSimilarityLoss
- - distance_metric: cosine_distance
- - margin: 0.25
- - end_to_end: False
- - use_amp: False
- - warmup_proportion: 0.1
- - seed: 42
- - eval_max_steps: -1
- - load_best_model_at_end: True
-
- ### Training Results
- | Epoch | Step | Training Loss | Validation Loss |
- |:-------:|:-------:|:-------------:|:---------------:|
- | 0.0019 | 1 | 0.238 | - |
- | 0.0931 | 50 | 0.1207 | - |
- | 0.1862 | 100 | 0.0126 | - |
- | 0.2793 | 150 | 0.005 | - |
- | 0.3724 | 200 | 0.0035 | - |
- | 0.4655 | 250 | 0.0028 | - |
- | 0.5587 | 300 | 0.0029 | - |
- | 0.6518 | 350 | 0.0027 | - |
- | 0.7449 | 400 | 0.0039 | - |
- | 0.8380 | 450 | 0.0028 | - |
- | 0.9311 | 500 | 0.0028 | - |
- | **1.0** | **537** | **-** | **0.0004** |
-
- * The bold row denotes the saved checkpoint.
- ### Framework Versions
- - Python: 3.10.12
- - SetFit: 1.0.3
- - Sentence Transformers: 2.4.0
- - Transformers: 4.38.1
- - PyTorch: 2.0.1+cu118
- - Datasets: 2.17.1
- - Tokenizers: 0.15.2
-
- ## Citation
-
- ### BibTeX
- ```bibtex
- @article{https://doi.org/10.48550/arxiv.2209.11055,
-     doi = {10.48550/ARXIV.2209.11055},
-     url = {https://arxiv.org/abs/2209.11055},
-     author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
-     keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
-     title = {Efficient Few-Shot Learning Without Prompts},
-     publisher = {arXiv},
-     year = {2022},
-     copyright = {Creative Commons Attribution 4.0 International}
- }
- ```
-
- <!--
- ## Glossary
-
- *Clearly define terms in order to be accessible across audiences.*
- -->
-
- <!--
- ## Model Card Authors
-
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
- -->
-
- <!--
- ## Model Card Contact
-
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->