xuandin committed on
Commit f0421e6 · verified · 1 Parent(s): 4da9769

Update README.md

Files changed (1)
  1. README.md +275 -0
README.md CHANGED
@@ -71,6 +71,281 @@ for i, (label, prob) in enumerate(zip(labels, probabilities.tolist()), start=1):
  # 2) REFUTED 0.9981
  ```
 
  ## **Citation**
 
  If you use **SemViQA-BC** in your research, please cite:
 
  # 2) REFUTED 0.9981
  ```
 
+ ## **Evaluation Results**
+
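+ The table below reports the detailed evaluation of SemViQA-BC on ISE-DSC01. For each method it lists strict accuracy, verdict classification (VC) accuracy, evidence retrieval (ER) accuracy, and running time in seconds. The configurations marked (ours) reach the highest accuracy, up to 78.97 strict accuracy, 82.54 VC accuracy, and 80.91 ER accuracy, and the SER Faster variant reaches 78.37 strict accuracy in 925 s.
+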
+ <table>
+ <thead>
+ <tr>
+ <th colspan="2">Method</th>
+ <th colspan="4">ISE-DSC01</th>
+ </tr>
+ <tr>
+ <th>ER</th>
+ <th>VC</th>
+ <th>Strict Acc</th>
+ <th>VC Acc</th>
+ <th>ER Acc</th>
+ <th>Time (s)</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td rowspan="3">TF-IDF</td>
+ <td>InfoXLM<sub>large</sub></td>
+ <td>73.59</td>
+ <td>78.08</td>
+ <td>76.61</td>
+ <td>378</td>
+ </tr>
+ <tr>
+ <td>XLM-R<sub>large</sub></td>
+ <td>75.61</td>
+ <td>80.50</td>
+ <td>78.58</td>
+ <td>366</td>
+ </tr>
+ <tr>
+ <td>Ernie-M<sub>large</sub></td>
+ <td>78.19</td>
+ <td>81.69</td>
+ <td>80.65</td>
+ <td>403</td>
+ </tr>
+ <tr>
+ <td rowspan="3">BM25</td>
+ <td>InfoXLM<sub>large</sub></td>
+ <td>72.09</td>
+ <td>77.37</td>
+ <td>75.04</td>
+ <td>320</td>
+ </tr>
+ <tr>
+ <td>XLM-R<sub>large</sub></td>
+ <td>73.94</td>
+ <td>79.37</td>
+ <td>76.95</td>
+ <td>333</td>
+ </tr>
+ <tr>
+ <td>Ernie-M<sub>large</sub></td>
+ <td>76.58</td>
+ <td>80.76</td>
+ <td>79.02</td>
+ <td>381</td>
+ </tr>
+ <tr>
+ <td rowspan="3">SBert</td>
+ <td>InfoXLM<sub>large</sub></td>
+ <td>71.20</td>
+ <td>76.59</td>
+ <td>74.15</td>
+ <td>915</td>
+ </tr>
+ <tr>
+ <td>XLM-R<sub>large</sub></td>
+ <td>72.85</td>
+ <td>78.78</td>
+ <td>75.89</td>
+ <td>835</td>
+ </tr>
+ <tr>
+ <td>Ernie-M<sub>large</sub></td>
+ <td>75.46</td>
+ <td>79.89</td>
+ <td>77.91</td>
+ <td>920</td>
+ </tr>
+ <tr>
+ <th colspan="1">QA-based approaches</th>
+ <th colspan="1">VC</th>
+ <th colspan="4"></th>
+ </tr>
+ <tr>
+ <td rowspan="3">ViMRC<sub>large</sub></td>
+ <td>InfoXLM<sub>large</sub></td>
+ <td>54.36</td>
+ <td>64.14</td>
+ <td>56.84</td>
+ <td>9798</td>
+ </tr>
+ <tr>
+ <td>XLM-R<sub>large</sub></td>
+ <td>53.98</td>
+ <td>66.70</td>
+ <td>57.77</td>
+ <td>9809</td>
+ </tr>
+ <tr>
+ <td>Ernie-M<sub>large</sub></td>
+ <td>56.62</td>
+ <td>62.19</td>
+ <td>58.91</td>
+ <td>9833</td>
+ </tr>
+ <tr>
+ <td rowspan="3">InfoXLM<sub>large</sub></td>
+ <td>InfoXLM<sub>large</sub></td>
+ <td>53.50</td>
+ <td>63.83</td>
+ <td>56.17</td>
+ <td>10057</td>
+ </tr>
+ <tr>
+ <td>XLM-R<sub>large</sub></td>
+ <td>53.32</td>
+ <td>66.70</td>
+ <td>57.25</td>
+ <td>10066</td>
+ </tr>
+ <tr>
+ <td>Ernie-M<sub>large</sub></td>
+ <td>56.34</td>
+ <td>62.36</td>
+ <td>58.69</td>
+ <td>10078</td>
+ </tr>
+ <tr>
+ <th colspan="2">LLM</th>
+ <th colspan="4"></th>
+ </tr>
+ <tr>
+ <td colspan="2">Qwen2.5-1.5B-Instruct</td>
+ <td>59.23</td>
+ <td>66.68</td>
+ <td>65.51</td>
+ <td>19780</td>
+ </tr>
+ <tr>
+ <td colspan="2">Qwen2.5-3B-Instruct</td>
+ <td>60.87</td>
+ <td>66.92</td>
+ <td>66.10</td>
+ <td>31284</td>
+ </tr>
+ <tr>
+ <th colspan="1">LLM</th>
+ <th colspan="1">VC</th>
+ <th colspan="4"></th>
+ </tr>
+ <tr>
+ <td rowspan="3">Qwen2.5-1.5B-Instruct</td>
+ <td>InfoXLM<sub>large</sub></td>
+ <td>64.40</td>
+ <td>68.37</td>
+ <td>66.49</td>
+ <td>19970</td>
+ </tr>
+ <tr>
+ <td>XLM-R<sub>large</sub></td>
+ <td>64.66</td>
+ <td>69.63</td>
+ <td>66.72</td>
+ <td>19976</td>
+ </tr>
+ <tr>
+ <td>Ernie-M<sub>large</sub></td>
+ <td>65.70</td>
+ <td>68.37</td>
+ <td>67.33</td>
+ <td>20003</td>
+ </tr>
+ <tr>
+ <td rowspan="3">Qwen2.5-3B-Instruct</td>
+ <td>InfoXLM<sub>large</sub></td>
+ <td>65.72</td>
+ <td>69.66</td>
+ <td>67.51</td>
+ <td>31477</td>
+ </tr>
+ <tr>
+ <td>XLM-R<sub>large</sub></td>
+ <td>66.12</td>
+ <td>70.44</td>
+ <td>67.83</td>
+ <td>31483</td>
+ </tr>
+ <tr>
+ <td>Ernie-M<sub>large</sub></td>
+ <td>67.48</td>
+ <td>70.77</td>
+ <td>68.75</td>
+ <td>31512</td>
+ </tr>
+ <tr>
+ <th colspan="1">SER Faster (ours)</th>
+ <th colspan="1">TVC (ours)</th>
+ <th colspan="4"></th>
+ </tr>
+ <tr>
+ <td>TF-IDF + ViMRC<sub>large</sub></td>
+ <td>Ernie-M<sub>large</sub></td>
+ <td style="color:blue">78.32</td>
+ <td style="color:blue">81.91</td>
+ <td style="color:blue">80.26</td>
+ <td style="color:blue">995</td>
+ </tr>
+ <tr>
+ <td>TF-IDF + InfoXLM<sub>large</sub></td>
+ <td>Ernie-M<sub>large</sub></td>
+ <td style="color:blue">78.37</td>
+ <td style="color:blue">81.91</td>
+ <td style="color:blue">80.32</td>
+ <td style="color:blue">925</td>
+ </tr>
+ <tr>
+ <th colspan="1">SER (ours)</th>
+ <th colspan="1">TVC (ours)</th>
+ <th colspan="4"></th>
+ </tr>
+ <tr>
+ <td rowspan="3">TF-IDF + ViMRC<sub>large</sub></td>
+ <td>InfoXLM<sub>large</sub></td>
+ <td>75.13</td>
+ <td>79.54</td>
+ <td>76.87</td>
+ <td>5191</td>
+ </tr>
+ <tr>
+ <td>XLM-R<sub>large</sub></td>
+ <td>76.71</td>
+ <td>81.65</td>
+ <td>78.91</td>
+ <td>5219</td>
+ </tr>
+ <tr>
+ <td>Ernie-M<sub>large</sub></td>
+ <td><strong>78.97</strong></td>
+ <td><strong>82.54</strong></td>
+ <td><strong>80.91</strong></td>
+ <td>5225</td>
+ </tr>
+ <tr>
+ <td rowspan="3">TF-IDF + InfoXLM<sub>large</sub></td>
+ <td>InfoXLM<sub>large</sub></td>
+ <td>75.13</td>
+ <td>79.60</td>
+ <td>76.87</td>
+ <td>5175</td>
+ </tr>
+ <tr>
+ <td>XLM-R<sub>large</sub></td>
+ <td>76.74</td>
+ <td>81.71</td>
+ <td>78.95</td>
+ <td>5200</td>
+ </tr>
+ <tr>
+ <td>Ernie-M<sub>large</sub></td>
+ <td><strong>78.97</strong></td>
+ <td>82.49</td>
+ <td><strong>80.91</strong></td>
+ <td>5297</td>
+ </tr>
+ </tbody>
+ </table>
+
  ## **Citation**
 
  If you use **SemViQA-BC** in your research, please cite: