Commit dbda129 by xuandin (verified) · 1 parent: d5e135c

Update README.md

Files changed (1)
  1. README.md +276 -1
README.md CHANGED
@@ -23,7 +23,7 @@ tags:
23
  - **Task:** Binary Classification (Fact Verification)
24
  - **Dataset:** [ViWikiFC](https://arxiv.org/abs/2405.07615)
25
 
26
- SemViQA-BC is one of the key components of the two-step classification approach in the SemViQA system. It performs binary classification, determining whether a claim is SUPPORTED or REFUTED. This step follows an initial three-class classification, in which claims are first categorized as SUPPORTED, REFUTED, or NOT ENOUGH INFORMATION (NEI). SemViQA-BC combines Cross-Entropy Loss and Focal Loss to improve the precision of claim verification.
27
 
28
  ## Usage Example
29
 
@@ -69,6 +69,281 @@ for i, (label, prob) in enumerate(zip(labels, probabilities.tolist()), start=1):
69
  # 2) REFUTED 0.9999
70
  ```
71
 
72
  ## **Citation**
73
 
74
  If you use **SemViQA-BC** in your research, please cite:
 
23
  - **Task:** Binary Classification (Fact Verification)
24
  - **Dataset:** [ViWikiFC](https://arxiv.org/abs/2405.07615)
25
 
26
+ SemViQA-BC is one of the key components of the two-step classification (TVC) approach in the SemViQA system. It performs binary classification, determining whether a claim is SUPPORTED or REFUTED. This step follows an initial three-class classification, in which claims are first categorized as SUPPORTED, REFUTED, or NOT ENOUGH INFORMATION (NEI). SemViQA-BC combines Cross-Entropy Loss and Focal Loss to improve the precision of claim verification.
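The two-step flow described above can be sketched in a few lines. This is only one plausible reading of the description, not the SemViQA implementation: the label orderings, the function name `two_step_verdict`, and the rule "keep NEI, otherwise defer to the binary model" are assumptions made for illustration, and the real system may apply additional criteria such as confidence thresholds.

```python
from typing import Sequence

# Assumed label orderings (hypothetical; check the model config for the real ones).
THREE_CLASS_LABELS = ("SUPPORTED", "REFUTED", "NEI")
BINARY_LABELS = ("SUPPORTED", "REFUTED")


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def two_step_verdict(
    three_class_probs: Sequence[float],
    binary_probs: Sequence[float],
) -> str:
    """Combine a three-class prediction with a binary one.

    Step 1: the three-class model decides whether the claim is NEI.
    Step 2: if it is not NEI, the binary model picks SUPPORTED or REFUTED.
    """
    first_pass = THREE_CLASS_LABELS[argmax(three_class_probs)]
    if first_pass == "NEI":
        return "NEI"
    return BINARY_LABELS[argmax(binary_probs)]
```

For example, `two_step_verdict([0.6, 0.3, 0.1], [0.2, 0.8])` returns `"REFUTED"`: the first step rules out NEI and the binary model then overrides the three-class preference for SUPPORTED.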
27
 
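To make the role of the two losses named in the description more concrete, here is a minimal per-example sketch in plain Python. The standard definitions are used: cross-entropy is `-log(p_t)` and focal loss is `-(1 - p_t)^gamma * log(p_t)`, where `p_t` is the probability assigned to the true class. How SemViQA-BC actually combines them is not stated in this README, so the simple weighted sum in `combined_loss` and the parameters `gamma` and `focal_weight` are assumptions for illustration only.

```python
import math


def cross_entropy(p_true: float) -> float:
    """Cross-entropy for one example, given the true-class probability."""
    return -math.log(p_true)


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Focal loss for one example; gamma = 0 reduces it to cross-entropy."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


def combined_loss(p_true: float, gamma: float = 2.0, focal_weight: float = 1.0) -> float:
    """Hypothetical combination: cross-entropy plus a weighted focal term."""
    return cross_entropy(p_true) + focal_weight * focal_loss(p_true, gamma)
```

The focal term shrinks quickly for confidently correct examples (large `p_true`), so the combined objective puts relatively more weight on hard examples than cross-entropy alone.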
28
  ## Usage Example
29
 
 
69
  # 2) REFUTED 0.9999
70
  ```
71
 
72
+ ## **Evaluation Results**
73
+
74
+ On the ViWikiFC test set, pipelines that use the TVC step reach up to 80.82 Strict Acc (SER with TF-IDF + InfoXLM<sub>large</sub>, TVC with XLM-R<sub>large</sub>), while the SER Faster variant reaches 79.77 Strict Acc in 487 s. The detailed evaluation is presented in the table below.
75
+
76
+ <table>
77
+ <thead>
78
+ <tr>
79
+ <th colspan="2">Method</th>
80
+ <th colspan="4">ViWikiFC</th>
81
+ </tr>
82
+ <tr>
83
+ <th>ER</th>
84
+ <th>VC</th>
85
+ <th>Strict Acc</th>
86
+ <th>VC Acc</th>
87
+ <th>ER Acc</th>
88
+ <th>Time (s)</th>
89
+ </tr>
90
+ </thead>
91
+ <tbody>
92
+ <tr>
93
+ <td rowspan="3">TF-IDF</td>
94
+ <td>InfoXLM<sub>large</sub></td>
95
+ <td>75.56</td>
96
+ <td>82.21</td>
97
+ <td>90.15</td>
98
+ <td>131</td>
99
+ </tr>
100
+ <tr>
101
+ <td>XLM-R<sub>large</sub></td>
102
+ <td>76.47</td>
103
+ <td>82.78</td>
104
+ <td>90.15</td>
105
+ <td>134</td>
106
+ </tr>
107
+ <tr>
108
+ <td>Ernie-M<sub>large</sub></td>
109
+ <td>75.56</td>
110
+ <td>81.83</td>
111
+ <td>90.15</td>
112
+ <td>144</td>
113
+ </tr>
114
+ <tr>
115
+ <td rowspan="3">BM25</td>
116
+ <td>InfoXLM<sub>large</sub></td>
117
+ <td>70.44</td>
118
+ <td>79.01</td>
119
+ <td>83.50</td>
120
+ <td>130</td>
121
+ </tr>
122
+ <tr>
123
+ <td>XLM-R<sub>large</sub></td>
124
+ <td>70.97</td>
125
+ <td>78.91</td>
126
+ <td>83.50</td>
127
+ <td>132</td>
128
+ </tr>
129
+ <tr>
130
+ <td>Ernie-M<sub>large</sub></td>
131
+ <td>70.21</td>
132
+ <td>78.29</td>
133
+ <td>83.50</td>
134
+ <td>141</td>
135
+ </tr>
136
+ <tr>
137
+ <td rowspan="3">SBert</td>
138
+ <td>InfoXLM<sub>large</sub></td>
139
+ <td>74.99</td>
140
+ <td>81.59</td>
141
+ <td>89.72</td>
142
+ <td>195</td>
143
+ </tr>
144
+ <tr>
145
+ <td>XLM-R<sub>large</sub></td>
146
+ <td>75.80</td>
147
+ <td>82.35</td>
148
+ <td>89.72</td>
149
+ <td>194</td>
150
+ </tr>
151
+ <tr>
152
+ <td>Ernie-M<sub>large</sub></td>
153
+ <td>75.13</td>
154
+ <td>81.44</td>
155
+ <td>89.72</td>
156
+ <td>203</td>
157
+ </tr>
158
+ <tr>
159
+ <th colspan="1">QA-based approaches</th>
160
+ <th colspan="1">VC</th>
161
+ <th colspan="4"></th>
162
+ </tr>
163
+ <tr>
164
+ <td rowspan="3">ViMRC<sub>large</sub></td>
165
+ <td>InfoXLM<sub>large</sub></td>
166
+ <td>77.28</td>
167
+ <td>81.97</td>
168
+ <td>92.49</td>
169
+ <td>3778</td>
170
+ </tr>
171
+ <tr>
172
+ <td>XLM-R<sub>large</sub></td>
173
+ <td>78.29</td>
174
+ <td>82.83</td>
175
+ <td>92.49</td>
176
+ <td>3824</td>
177
+ </tr>
178
+ <tr>
179
+ <td>Ernie-M<sub>large</sub></td>
180
+ <td>77.38</td>
181
+ <td>81.92</td>
182
+ <td>92.49</td>
183
+ <td>3785</td>
184
+ </tr>
185
+ <tr>
186
+ <td rowspan="3">InfoXLM<sub>large</sub></td>
187
+ <td>InfoXLM<sub>large</sub></td>
188
+ <td>78.14</td>
189
+ <td>82.07</td>
190
+ <td>93.45</td>
191
+ <td>4092</td>
192
+ </tr>
193
+ <tr>
194
+ <td>XLM-R<sub>large</sub></td>
195
+ <td>79.20</td>
196
+ <td>83.07</td>
197
+ <td>93.45</td>
198
+ <td>4096</td>
199
+ </tr>
200
+ <tr>
201
+ <td>Ernie-M<sub>large</sub></td>
202
+ <td>78.24</td>
203
+ <td>82.21</td>
204
+ <td>93.45</td>
205
+ <td>4102</td>
206
+ </tr>
207
+ <tr>
208
+ <th colspan="2">LLM</th>
209
+ <th colspan="4"></th>
210
+ </tr>
211
+ <tr>
212
+ <td colspan="2">Qwen2.5-1.5B-Instruct</td>
213
+ <td>51.03</td>
214
+ <td>65.18</td>
215
+ <td>78.96</td>
216
+ <td>7665</td>
217
+ </tr>
218
+ <tr>
219
+ <td colspan="2">Qwen2.5-3B-Instruct</td>
220
+ <td>44.38</td>
221
+ <td>62.31</td>
222
+ <td>71.35</td>
223
+ <td>12123</td>
224
+ </tr>
225
+ <tr>
226
+ <th colspan="1">LLM</th>
227
+ <th colspan="1">VC</th>
228
+ <th colspan="4"></th>
229
+ </tr>
230
+ <tr>
231
+ <td rowspan="3">Qwen2.5-1.5B-Instruct</td>
232
+ <td>InfoXLM<sub>large</sub></td>
233
+ <td>66.14</td>
234
+ <td>76.47</td>
235
+ <td>78.96</td>
236
+ <td>7788</td>
237
+ </tr>
238
+ <tr>
239
+ <td>XLM-R<sub>large</sub></td>
240
+ <td>67.67</td>
241
+ <td>78.10</td>
242
+ <td>78.96</td>
243
+ <td>7789</td>
244
+ </tr>
245
+ <tr>
246
+ <td>Ernie-M<sub>large</sub></td>
247
+ <td>66.52</td>
248
+ <td>76.52</td>
249
+ <td>78.96</td>
250
+ <td>7794</td>
251
+ </tr>
252
+ <tr>
253
+ <td rowspan="3">Qwen2.5-3B-Instruct</td>
254
+ <td>InfoXLM<sub>large</sub></td>
255
+ <td>59.88</td>
256
+ <td>72.50</td>
257
+ <td>71.35</td>
258
+ <td>12246</td>
259
+ </tr>
260
+ <tr>
261
+ <td>XLM-R<sub>large</sub></td>
262
+ <td>60.74</td>
263
+ <td>73.08</td>
264
+ <td>71.35</td>
265
+ <td>12246</td>
266
+ </tr>
267
+ <tr>
268
+ <td>Ernie-M<sub>large</sub></td>
269
+ <td>60.02</td>
270
+ <td>72.21</td>
271
+ <td>71.35</td>
272
+ <td>12251</td>
273
+ </tr>
274
+ <tr>
275
+ <th colspan="1">SER Faster (ours)</th>
276
+ <th colspan="1">TVC (ours)</th>
277
+ <th colspan="4"></th>
278
+ </tr>
279
+ <tr>
280
+ <td>TF-IDF + ViMRC<sub>large</sub></td>
281
+ <td>Ernie-M<sub>large</sub></td>
282
+ <td style="color:blue">79.44</td>
283
+ <td style="color:blue">82.93</td>
284
+ <td style="color:blue">94.60</td>
285
+ <td style="color:blue">410</td>
286
+ </tr>
287
+ <tr>
288
+ <td>TF-IDF + InfoXLM<sub>large</sub></td>
289
+ <td>Ernie-M<sub>large</sub></td>
290
+ <td style="color:blue">79.77</td>
291
+ <td style="color:blue">83.07</td>
292
+ <td style="color:blue">95.03</td>
293
+ <td style="color:blue">487</td>
294
+ </tr>
295
+ <tr>
296
+ <th colspan="1">SER (ours)</th>
297
+ <th colspan="1">TVC (ours)</th>
298
+ <th colspan="4"></th>
299
+ </tr>
300
+ <tr>
301
+ <td rowspan="3">TF-IDF + ViMRC<sub>large</sub></td>
302
+ <td>InfoXLM<sub>large</sub></td>
303
+ <td>80.25</td>
304
+ <td>83.84</td>
305
+ <td>94.69</td>
306
+ <td>2731</td>
307
+ </tr>
308
+ <tr>
309
+ <td>XLM-R<sub>large</sub></td>
310
+ <td>80.34</td>
311
+ <td>83.64</td>
312
+ <td>94.69</td>
313
+ <td>2733</td>
314
+ </tr>
315
+ <tr>
316
+ <td>Ernie-M<sub>large</sub></td>
317
+ <td>79.53</td>
318
+ <td>82.97</td>
319
+ <td>94.69</td>
320
+ <td>2733</td>
321
+ </tr>
322
+ <tr>
323
+ <td rowspan="3">TF-IDF + InfoXLM<sub>large</sub></td>
324
+ <td>InfoXLM<sub>large</sub></td>
325
+ <td>80.68</td>
326
+ <td><strong>83.98</strong></td>
327
+ <td><strong>95.31</strong></td>
328
+ <td>3860</td>
329
+ </tr>
330
+ <tr>
331
+ <td>XLM-R<sub>large</sub></td>
332
+ <td><strong>80.82</strong></td>
333
+ <td>83.88</td>
334
+ <td><strong>95.31</strong></td>
335
+ <td>3843</td>
336
+ </tr>
337
+ <tr>
338
+ <td>Ernie-M<sub>large</sub></td>
339
+ <td>80.06</td>
340
+ <td>83.17</td>
341
+ <td><strong>95.31</strong></td>
342
+ <td>3891</td>
343
+ </tr>
344
+ </tbody>
345
+ </table>
346
+
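The three accuracy columns can be related with a short sketch. It assumes that ER Acc measures exact-match evidence retrieval, that VC Acc measures verdict classification, and that Strict Acc counts a sample as correct only when both the evidence and the verdict are correct. These definitions are not spelled out in this README, but they are consistent with the table, where Strict Acc never exceeds VC Acc or ER Acc in any row. The field names and the function `pipeline_accuracies` are hypothetical.

```python
from typing import Dict, List


def pipeline_accuracies(samples: List[Dict[str, str]]) -> Dict[str, float]:
    """Compute ER, VC, and Strict accuracy (in percent) over a list of samples.

    Each sample is a dict with the hypothetical keys
    'gold_evidence', 'pred_evidence', 'gold_verdict', 'pred_verdict'.
    """
    total = len(samples)
    er_hits = vc_hits = strict_hits = 0
    for s in samples:
        evidence_ok = s["pred_evidence"] == s["gold_evidence"]
        verdict_ok = s["pred_verdict"] == s["gold_verdict"]
        er_hits += evidence_ok
        vc_hits += verdict_ok
        # Strict: both evidence and verdict must be right (assumed definition).
        strict_hits += evidence_ok and verdict_ok
    return {
        "er_acc": 100.0 * er_hits / total,
        "vc_acc": 100.0 * vc_hits / total,
        "strict_acc": 100.0 * strict_hits / total,
    }
```

Under this definition a strong retriever raises the ceiling for Strict Acc, which is why rows with the same ER Acc differ only through the verdict classifier.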
347
  ## **Citation**
348
 
349
  If you use **SemViQA-BC** in your research, please cite: