MilosKosRad committed
Commit cd7918b · verified · 1 Parent(s): a3e48f3

Create README.md

Files changed (1):
  1. README.md +67 -0
README.md ADDED
---
language: sr
metrics:
- accuracy
- precision
- recall
- f1
base_model: microsoft/deberta-v3-large
---
# srbNLI: Serbian Natural Language Inference Model

## Model Overview
srbNLI is a fine-tuned Natural Language Inference (NLI) model for Serbian, created by adapting the SciFact dataset to Serbian via automatic translation. Built on a state-of-the-art transformer architecture, it is trained to recognize relationships between claims and evidence in Serbian text, with applications in scientific claim verification and potential expansion to broader claim verification tasks.

## Key Details
- **Model Type**: Transformer-based
- **Language**: Serbian
- **Task**: Natural Language Inference (NLI), Textual Entailment, Claim Verification
- **Dataset**: srbSciFact (automatically translated SciFact dataset)
- **Fine-tuning**: Fine-tuned on Serbian NLI data (support, contradiction, and neutral categories)
- **Metrics**: Accuracy, Precision, Recall, F1-score
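
The snippet below is a minimal inference sketch showing how a sequence-classification NLI model like this one can be loaded with the Hugging Face `transformers` library. The repository ID `MilosKosRad/srbNLI`, the example sentences, and the label order are placeholders assumed for illustration; check the actual repository ID and the label mapping in `config.json` before relying on the output.

```python
# Minimal inference sketch (not official usage instructions).
# "MilosKosRad/srbNLI" is a placeholder repository ID and the label order
# (support / contradiction / neutral) is assumed -- verify both against
# this model's config.json.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "MilosKosRad/srbNLI"  # placeholder -- replace with the actual repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

evidence = "Primer rečenice sa dokazima na srpskom jeziku."  # Serbian evidence (premise)
claim = "Primer tvrdnje na srpskom jeziku."                  # Serbian claim (hypothesis)

# NLI models take a (premise, hypothesis) pair; here the evidence serves as
# the premise and the claim as the hypothesis.
inputs = tokenizer(evidence, claim, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

predicted = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted])  # e.g. support / contradiction / neutral
```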

## Motivation
This model addresses the lack of NLI datasets and models for Serbian, a low-resource language. It provides a tool for textual entailment and claim verification, especially for scientific claims, with broader potential for misinformation detection and automated fact-checking.

## Training
- **Base Model**: DeBERTa-v3-large
- **Training Data**: Automatically translated SciFact dataset (srbSciFact)
- **Fine-tuning**: Conducted on a single NVIDIA A100 GPU (40 GB, DGX system)
- **Hyperparameters**: Optimized learning rate, batch size, weight decay, number of epochs, and early stopping
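
As a rough illustration of the setup described above, the sketch below fine-tunes `microsoft/deberta-v3-large` for three-way NLI with the Hugging Face `Trainer` and early stopping. The toy dataset rows, the label encoding, and every hyperparameter value are illustrative assumptions, not the exact configuration used to produce this checkpoint.

```python
# Illustrative fine-tuning sketch; dataset rows and hyperparameter values are
# placeholders, not the exact configuration used to train srbNLI.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorWithPadding,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

base = "microsoft/deberta-v3-large"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=3)

# Tiny stand-in for the translated srbSciFact splits: (evidence, claim, label)
# triples with an assumed mapping support=0, contradiction=1, neutral=2.
raw = Dataset.from_dict({
    "evidence": ["Primer dokaza na srpskom jeziku."],
    "claim": ["Primer tvrdnje na srpskom jeziku."],
    "labels": [0],
})

def preprocess(batch):
    # Pair the evidence (premise) with the claim (hypothesis).
    return tokenizer(batch["evidence"], batch["claim"], truncation=True)

encoded = raw.map(preprocess, batched=True)

args = TrainingArguments(
    output_dir="srbNLI",
    learning_rate=1e-5,              # illustrative value
    per_device_train_batch_size=8,   # illustrative value
    weight_decay=0.01,               # illustrative value
    num_train_epochs=10,             # illustrative value
    eval_strategy="epoch",           # "evaluation_strategy" in older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,     # needed so early stopping tracks the dev split
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded,
    eval_dataset=encoded,            # use the real dev split in practice
    data_collator=DataCollatorWithPadding(tokenizer),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```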

## Evaluation
The model was evaluated using standard NLI metrics (accuracy, precision, recall, F1-score) and compared against GPT-4o to assess its generalization capabilities.
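
For reference, a minimal sketch of how these metrics can be computed with scikit-learn is shown below; the macro averaging and the 0/1/2 label encoding are assumptions for illustration, not necessarily the setup used to produce the reported scores.

```python
# Sketch of the evaluation metrics; the macro averaging and label encoding
# (support=0, contradiction=1, neutral=2) are illustrative assumptions.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 1, 2, 0, 2, 1]  # gold labels
y_pred = [0, 1, 1, 0, 2, 1]  # model predictions

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"Acc={accuracy:.2f}  P={precision:.2f}  R={recall:.2f}  F1={f1:.2f}")
```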

## Use Cases
- **Claim Verification**: Scientific and general-domain claims in Serbian
- **Misinformation Detection**: Identifying contradictions or support between claims and evidence
- **Cross-lingual Applications**: Potential for cross-lingual claim verification with multilingual models

## Future Work
- Improving accuracy with human-corrected translations and Serbian-specific datasets
- Expanding to general-domain claim verification
- Enhancing multilingual NLI capabilities

## Results Comparison

The table below compares the fine-tuned models (DeBERTa-v3-large, RoBERTa-large, BERTić, and other multilingual baselines) and GPT-4o on the srbSciFact dataset across the key metrics: Accuracy (Acc), Precision (P), Recall (R), and F1-score (F1). The models were evaluated on their ability to classify relationships between claims and evidence in Serbian text.

| Model                  | Accuracy | Precision (P) | Recall (R) | F1-score (F1) |
|------------------------|----------|---------------|------------|---------------|
| **DeBERTa-v3-large**   | 0.70     | 0.86          | 0.82       | 0.84          |
| **RoBERTa-large**      | 0.57     | 0.63          | 0.76       | 0.69          |
| **BERTić (Serbian)**   | 0.56     | 0.56          | 0.37       | 0.44          |
| **GPT-4o (English)**   | 0.66     | 0.70          | 0.77       | 0.78          |
| **mDeBERTa-base**      | 0.63     | 0.92          | 0.75       | 0.83          |
| **XLM-RoBERTa-large**  | 0.64     | 0.89          | 0.77       | 0.83          |
| **mBERT-cased**        | 0.48     | 0.76          | 0.50       | 0.60          |
| **mBERT-uncased**      | 0.57     | 0.45          | 0.61       | 0.52          |

### Observations
- **DeBERTa-v3-large** performed best overall, with an accuracy of 0.70 and an F1-score of 0.84.
- **RoBERTa-large** and **BERTić** showed lower performance, with BERTić in particular struggling on recall (0.37), suggesting challenges in handling complex linguistic inference in Serbian.
- **GPT-4o** outperforms all fine-tuned models in F1-score when the prompt is in English, but the **DeBERTa-v3-large** model slightly outperforms GPT-4o when the prompt is in Serbian.
- **mDeBERTa-base** and **XLM-RoBERTa-large** exhibited strong cross-lingual performance, both with an F1-score of 0.83.

These results demonstrate the potential of adapting advanced transformer models to Serbian while highlighting areas for future improvement, such as refining translations and expanding domain-specific data.

---