Improve model card: Update license, add languages, expand sections, and include usage example

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +170 -14
README.md CHANGED
@@ -1,25 +1,50 @@
  ---
  library_name: transformers
- tags:
- - generated_from_trainer
  metrics:
  - accuracy
  - f1
  model-index:
  - name: mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic
    results: []
- license: mit
- base_model:
- - microsoft/mdeberta-v3-base
- pipeline_tag: text-classification
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
  # mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic

- This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on the [CheckThat! Lab Task 1 Subjectivity Detection at CLEF 2025](arxiv.org/abs/2507.11764).
  It achieves the following results on the evaluation set:
  - Loss: 0.8900
  - Macro F1: 0.7969
@@ -32,15 +57,41 @@ It achieves the following results on the evaluation set:
  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure

@@ -72,4 +123,109 @@ The following hyperparameters were used during training:
  - Transformers 4.47.0
  - Pytorch 2.5.1+cu121
  - Datasets 3.3.1
- - Tokenizers 0.21.0
@@ -1,25 +1,50 @@
  ---
+ base_model:
+ - microsoft/mdeberta-v3-base
  library_name: transformers
+ license: cc-by-4.0
+ languages:
+ - ar
+ - de
+ - en
+ - it
+ - bg
+ - el
+ - pl
+ - ro
+ - uk
  metrics:
  - accuracy
  - f1
+ pipeline_tag: text-classification
+ tags:
+ - generated_from_trainer
+ - deberta
+ - multilingual
+ - subjectivity-detection
  model-index:
  - name: mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic
    results: []
+ datasets:
+ - MatteoFasulo/clef2025_checkthat_task1_subjectivity
+ language:
+ - ar
+ - de
+ - bg
+ - el
+ - it
+ - ro
+ - uk
+ - en
+ - pl
  ---

  # mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic

+ This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) for **Subjectivity Detection in News Articles**, as presented in the paper [AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles](https://huggingface.co/papers/2507.11764). It was developed by AI Wizards for their participation in the CLEF 2025 CheckThat! Lab Task 1.
+
+ **Code:** [https://github.com/MatteoFasulo/clef2025-checkthat](https://github.com/MatteoFasulo/clef2025-checkthat)
+
  It achieves the following results on the evaluation set:
  - Loss: 0.8900
  - Macro F1: 0.7969
 
@@ -32,15 +57,41 @@ It achieves the following results on the evaluation set:

  ## Model description

+ This model is designed to classify sentences as **subjective** (opinion-laden) or **objective** in news articles. This task is a key component in combating misinformation, improving fact-checking pipelines, and supporting journalists.
+
+ The primary strategy of this model, based on `mDeBERTaV3-base`, involves enhancing transformer-based embeddings by integrating sentiment scores derived from an auxiliary model. This sentiment-augmented architecture significantly boosts performance, especially the subjective F1 score. The model also employs decision threshold calibration, optimized on the development set, to effectively address class imbalance prevalent across the datasets.
+
+ **Key Contributions from the paper:**
+ 1. **Sentiment-Augmented Fine-Tuning**: Enriches embedding-based models by integrating sentiment scores from an auxiliary model, notably improving subjective sentence detection.
+ 2. **Diverse Model Coverage**: Benchmarked across multilingual BERT variants (like mDeBERTaV3-base) for all official CLEF languages.
+ 3. **Threshold Calibration for Imbalance**: A simple yet effective method to tune decision thresholds on each language's development data to enhance macro-F1 performance.
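The threshold calibration named in the key contributions can be illustrated with a minimal sketch. It assumes the classifier emits a probability for the subjective class and that calibration is a plain grid search for the threshold that maximizes macro-F1 on development data. The function names (`macro_f1`, `calibrate_threshold`), the search grid, and the toy data are hypothetical and are not taken from the authors' repository.

```python
from typing import Optional, Sequence, Tuple


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """F1 score of a single class, returning 0.0 when the class never occurs."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the F1 scores for OBJ (0) and SUBJ (1)."""
    return (f1_for_class(y_true, y_pred, 0) + f1_for_class(y_true, y_pred, 1)) / 2


def calibrate_threshold(
    subj_probs: Sequence[float],
    y_true: Sequence[int],
    grid: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Return the (threshold, macro-F1) pair that scores best on the dev data."""
    if grid is None:
        grid = [i / 100 for i in range(1, 100)]  # assumed grid: 0.01 .. 0.99
    best_t, best_score = 0.5, -1.0
    for t in grid:
        y_pred = [1 if p >= t else 0 for p in subj_probs]
        score = macro_f1(y_true, y_pred)
        if score > best_score:  # keep the first threshold reaching the best score
            best_t, best_score = t, score
    return best_t, best_score


# Toy development set (hypothetical): SUBJ probabilities and gold labels.
dev_probs = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.80]
dev_labels = [0, 0, 0, 1, 0, 1, 1, 1]

threshold, score = calibrate_threshold(dev_probs, dev_labels)
default_score = macro_f1(dev_labels, [1 if p >= 0.5 else 0 for p in dev_probs])
print(f"calibrated threshold={threshold:.2f}, macro-F1={score:.3f}")
print(f"default 0.5 threshold macro-F1={default_score:.3f}")
```

On imbalanced data the calibrated threshold often moves away from 0.5, which is why it is tuned per language on held-out data rather than fixed in advance.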

  ## Intended uses & limitations

+ **Intended Uses:**
+ This model is intended for subjectivity detection in news articles, classifying sentences as subjective or objective. It has been evaluated across:
+ * **Monolingual settings:** Arabic, German, English, Italian, and Bulgarian.
+ * **Zero-shot transfer:** Tested on unseen languages such as Greek, Polish, Romanian, and Ukrainian.
+ * **Multilingual training.**
+
+ Its capabilities make it suitable for applications that require discerning factual statements from opinions, such as:
+ * Combating misinformation by identifying biased or opinionated content.
+ * Improving automated fact-checking pipelines.
+ * Assisting journalists in content analysis and report generation.
+
+ **Limitations:**
+ * The model's performance is optimized for news articles; its effectiveness might vary on text from other domains (e.g., social media, conversational text) with different linguistic characteristics or content styles.
+ * While decision threshold calibration is used to mitigate class imbalance, extreme imbalances in specific datasets might still affect performance.
+ * The paper notes an initial submission error where an incorrect train/dev mix was used, leading to under-calibrated thresholds and lower initial scores in the multilingual track. While corrected results are reported, this highlights sensitivity to data preparation and calibration.

  ## Training and evaluation data

+ The model was fine-tuned on datasets provided for the **CLEF 2025 CheckThat! Lab Task 1: Subjectivity Detection in News Articles**.
+
+ * **Training and development datasets** were provided for monolingual settings in Arabic, German, English, Italian, and Bulgarian.
+ * **Final evaluation** included additional unseen languages such as Greek, Romanian, Polish, and Ukrainian to assess the model's generalization capabilities.
+
+ The training process involved enhancing sentence representations by integrating sentiment scores from an auxiliary model. Class imbalance, a notable characteristic across these languages, was addressed through decision threshold calibration on the development set.

  ## Training procedure
 
@@ -72,4 +123,109 @@ The following hyperparameters were used during training:
  - Transformers 4.47.0
  - Pytorch 2.5.1+cu121
  - Datasets 3.3.1
+ - Tokenizers 0.21.0
+
+ ## How to use
+
+ You can use this model directly with the Hugging Face `transformers` library for text classification:
+
+ ```python
+ import torch
+ import torch.nn as nn
+ from transformers import DebertaV2Model, DebertaV2Config, AutoTokenizer, PreTrainedModel, pipeline, AutoModelForSequenceClassification
+ from transformers.models.deberta.modeling_deberta import ContextPooler
+
+ # Auxiliary sentiment model that supplies the three sentiment scores
+ sent_pipe = pipeline(
+     "sentiment-analysis",
+     model="cardiffnlp/twitter-xlm-roberta-base-sentiment",
+     tokenizer="cardiffnlp/twitter-xlm-roberta-base-sentiment",
+     top_k=None,  # return all 3 sentiment scores
+ )
+
+ class CustomModel(PreTrainedModel):
+     config_class = DebertaV2Config
+
+     def __init__(self, config, sentiment_dim=3, num_labels=2, *args, **kwargs):
+         super().__init__(config, *args, **kwargs)
+         self.deberta = DebertaV2Model(config)
+         self.pooler = ContextPooler(config)
+         output_dim = self.pooler.output_dim
+         self.dropout = nn.Dropout(0.1)
+         # the classifier sees the pooled embedding plus the sentiment scores
+         self.classifier = nn.Linear(output_dim + sentiment_dim, num_labels)
+
+     def forward(self, input_ids, positive, neutral, negative, token_type_ids=None, attention_mask=None, labels=None):
+         outputs = self.deberta(input_ids=input_ids, attention_mask=attention_mask)
+         encoder_layer = outputs[0]
+         pooled_output = self.pooler(encoder_layer)
+         sentiment_features = torch.stack((positive, neutral, negative), dim=1).to(pooled_output.dtype)
+         combined_features = torch.cat((pooled_output, sentiment_features), dim=1)
+         logits = self.classifier(self.dropout(combined_features))
+         return {'logits': logits}
+
+ model_name = "MatteoFasulo/mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic"
+ tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")
+ config = DebertaV2Config.from_pretrained(
+     model_name,
+     num_labels=2,
+     id2label={0: 'OBJ', 1: 'SUBJ'},
+     label2id={'OBJ': 0, 'SUBJ': 1},
+     output_attentions=False,
+     output_hidden_states=False
+ )
+ # from_pretrained is a classmethod: call it on the class and pass the config
+ model = CustomModel.from_pretrained(model_name, config=config)
+
+ def classify_subjectivity(text: str):
+     # get full sentiment distribution
+     dist = sent_pipe(text)[0]
+     pos = next(d["score"] for d in dist if d["label"] == "positive")
+     neu = next(d["score"] for d in dist if d["label"] == "neutral")
+     neg = next(d["score"] for d in dist if d["label"] == "negative")
+
+     # tokenize the text
+     inputs = tokenizer(text, padding=True, truncation=True, max_length=256, return_tensors='pt')
+
+     # feeding in the three sentiment scores
+     with torch.no_grad():
+         outputs = model(
+             input_ids=inputs["input_ids"],
+             attention_mask=inputs["attention_mask"],
+             positive=torch.tensor(pos).unsqueeze(0).float(),
+             neutral=torch.tensor(neu).unsqueeze(0).float(),
+             negative=torch.tensor(neg).unsqueeze(0).float()
+         )
+
+     # compute probabilities and pick the top label
+     probs = torch.softmax(outputs.get('logits')[0], dim=-1)
+     label = model.config.id2label[int(probs.argmax())]
+     score = probs.max().item()
+
+     return {"label": label, "score": score}
+
+ examples = [
+     "The company reported a 10% increase in revenue for the last quarter.",
+     "Die angegebenen Fehlerquoten können daher nur für symptomatische Patienten gelten.",
+     "Si smonta qui definitivamente la narrazione per cui le scelte energetiche possono essere frutto esclusivo di valutazioni “tecniche” e non politiche.",
+ ]
+ for text in examples:
+     result = classify_subjectivity(text)
+     print(f"Text: {text}")
+     print(f"→ Subjectivity: {result['label']} (score={result['score']:.2f})\n")
+
+ ```
+
+ For more detailed usage, including training and evaluation scripts, please refer to the [GitHub repository](https://github.com/MatteoFasulo/clef2025-checkthat).
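The usage example picks the arg-max label, which for two classes amounts to a fixed 0.5 threshold. Since the model card describes calibrated decision thresholds, the decision could instead be made on the `SUBJ` probability directly. The following is a minimal sketch under stated assumptions: label index 1 is `SUBJ` (as in the `id2label` mapping of the usage example), the threshold value shown is hypothetical, and the helper `predict_with_threshold` is illustrative rather than part of the repository.

```python
import math
from typing import Dict, List, Sequence, Union


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_threshold(
    logits: Sequence[float], threshold: float = 0.5
) -> Dict[str, Union[str, float]]:
    """Label a sentence SUBJ when P(SUBJ) reaches the given threshold."""
    probs = softmax(logits)
    subj_prob = probs[1]  # assumption: index 0 is OBJ, index 1 is SUBJ
    label = "SUBJ" if subj_prob >= threshold else "OBJ"
    return {"label": label, "subj_prob": subj_prob}


# Stand-in for outputs["logits"][0].tolist() from the model in the usage example.
logits = [1.0, 0.0]
print(predict_with_threshold(logits))        # default 0.5 threshold
print(predict_with_threshold(logits, 0.25))  # hypothetical calibrated threshold
```

A threshold below 0.5 makes the classifier more willing to predict the subjective class, which can help when that class is the minority in the data.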
+
+ ## Citation
+
+ If you find our work helpful or inspiring, please feel free to cite it:
+
+ ```bibtex
+ @misc{fasulo2025aiwizardscheckthat2025,
+       title={AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles},
+       author={Matteo Fasulo and Luca Babboni and Luca Tedeschini},
+       year={2025},
+       eprint={2507.11764},
+       archivePrefix={arXiv},
+       primaryClass={cs.CL},
+       url={https://arxiv.org/abs/2507.11764},
+ }
+ ```