MatteoFasulo/mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic

@@ -25,6 +25,18 @@ tags:
 model-index:
 - name: mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic
   results: []
 ---
 # mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic
@@ -118,35 +130,102 @@ The following hyperparameters were used during training:
 You can use this model directly with the Hugging Face `transformers` library for text classification:
 ```python
-from transformers import pipeline
-model_name = "MatteoFasulo/mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic"
-classifier = pipeline("text-classification", model=model_name)
-# Example usage
-text_subjective = "This is a truly amazing product and I highly recommend it!"
-result_subjective = classifier(text_subjective)
-print(f"'{text_subjective}' -> {result_subjective}")
-# Expected output: [{'label': 'SUBJ', 'score': 0.99...}]
-text_objective = "The capital of France is Paris."
-result_objective = classifier(text_objective)
-print(f"'{text_objective}' -> {result_objective}")
-# Expected output: [{'label': 'OBJ', 'score': 0.98...}]
 ```
 For more detailed usage, including training and evaluation scripts, please refer to the [GitHub repository](https://github.com/MatteoFasulo/clef2025-checkthat).
 ## Citation
-If you find this model or the associated research useful, please consider citing the original paper:
 ```bibtex
-@article{aiwizards2025checkthat,
-  title={AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles},
-  author={AI Wizards team}, # Authors not fully listed in provided context, please refer to the full paper.
-  journal={arXiv preprint arXiv:2507.11764},
-  year={2025},
-  url={https://arxiv.org/abs/2507.11764}
 }
 ```

 model-index:
 - name: mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic
   results: []
+datasets:
+- MatteoFasulo/clef2025_checkthat_task1_subjectivity
+language:
+- ar
+- de
+- bg
+- el
+- it
+- ro
+- uk
+- en
+- pl
 ---
 # mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic
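 The classifier takes three sentiment scores alongside the text, so the example below wraps the encoder in a small custom class and computes the scores with a separate sentiment pipeline. You can run it with the Hugging Face `transformers` library: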
 ```python
+import torch
+import torch.nn as nn
+from transformers import AutoTokenizer, DebertaV2Config, DebertaV2Model, PreTrainedModel, pipeline
+from transformers.models.deberta.modeling_deberta import ContextPooler
+sent_pipe = pipeline(
+    "sentiment-analysis",
+    model="cardiffnlp/twitter-xlm-roberta-base-sentiment",
+    tokenizer="cardiffnlp/twitter-xlm-roberta-base-sentiment",
+    top_k=None,  # return all 3 sentiment scores
+)
+class CustomModel(PreTrainedModel):
+    """DeBERTa-v2 encoder whose pooled output is concatenated with three sentiment scores."""
+    config_class = DebertaV2Config
+    def __init__(self, config, sentiment_dim=3, num_labels=2, *args, **kwargs):
+        super().__init__(config, *args, **kwargs)
+        self.deberta = DebertaV2Model(config)
+        self.pooler = ContextPooler(config)
+        output_dim = self.pooler.output_dim
+        self.dropout = nn.Dropout(0.1)
+        # The classifier sees the pooled embedding plus the sentiment scores.
+        self.classifier = nn.Linear(output_dim + sentiment_dim, num_labels)
+    def forward(self, input_ids, positive, neutral, negative, token_type_ids=None, attention_mask=None, labels=None):
+        # token_type_ids and labels are accepted but not used in this forward pass.
+        outputs = self.deberta(input_ids=input_ids, attention_mask=attention_mask)
+        encoder_layer = outputs[0]  # last hidden states
+        pooled_output = self.pooler(encoder_layer)
+        # Shape (batch, 3): one column per sentiment score, cast to the encoder's dtype.
+        sentiment_features = torch.stack((positive, neutral, negative), dim=1).to(pooled_output.dtype)
+        combined_features = torch.cat((pooled_output, sentiment_features), dim=1)
+        logits = self.classifier(self.dropout(combined_features))
+        return {'logits': logits}
+model_name = "MatteoFasulo/mdeberta-v3-base-subjectivity-sentiment-multilingual-no-arabic"
+tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")
+config = DebertaV2Config.from_pretrained(
+    model_name,
+    num_labels=2,
+    id2label={0: 'OBJ', 1: 'SUBJ'},
+    label2id={'OBJ': 0, 'SUBJ': 1},
+    output_attentions=False,
+    output_hidden_states=False
+)
+model = CustomModel.from_pretrained(model_name, config=config)
+def classify_subjectivity(text: str):
+    # get full sentiment distribution
+    dist = sent_pipe(text)[0]
+    pos = next(d["score"] for d in dist if d["label"] == "positive")
+    neu = next(d["score"] for d in dist if d["label"] == "neutral")
+    neg = next(d["score"] for d in dist if d["label"] == "negative")
+    # tokenize the text
+    inputs = tokenizer(text, padding=True, truncation=True, max_length=256, return_tensors='pt')
+    # feeding in the three sentiment scores
+    with torch.no_grad():
+        outputs = model(
+            input_ids=inputs["input_ids"],
+            attention_mask=inputs["attention_mask"],
+            positive=torch.tensor(pos).unsqueeze(0).float(),
+            neutral=torch.tensor(neu).unsqueeze(0).float(),
+            negative=torch.tensor(neg).unsqueeze(0).float()
+        )
+    # compute probabilities and pick the top label
+    probs = torch.softmax(outputs.get('logits')[0], dim=-1)
+    label = model.config.id2label[int(probs.argmax())]
+    score = probs.max().item()
+    return {"label": label, "score": score}
+examples = [
+    "The company reported a 10% increase in revenue for the last quarter.",
+    "Die angegebenen Fehlerquoten können daher nur für symptomatische Patienten gelten.",
+    "Si smonta qui definitivamente la narrazione per cui le scelte energetiche possono essere frutto esclusivo di valutazioni “tecniche” e non politiche.",
+]
+for text in examples:
+    result = classify_subjectivity(text)
+    print(f"Text: {text}")
+    print(f"→ Subjectivity: {result['label']} (score={result['score']:.2f})\n")
 ```
 For more detailed usage, including training and evaluation scripts, please refer to the [GitHub repository](https://github.com/MatteoFasulo/clef2025-checkthat).
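The example above pulls the three sentiment scores out of the pipeline output with three separate `next(...)` lookups. As a minimal sketch, the same step can be written as one helper that returns the scores in the fixed (positive, neutral, negative) order that `CustomModel.forward` stacks them in. The helper name `sentiment_vector`, the `SENTIMENT_ORDER` constant, and the sample scores are hypothetical and only illustrate the list-of-dicts shape that the snippet reads as `dist`; lowercasing the labels is an added assumption for robustness.

```python
from typing import Any, Dict, List, Tuple

# Order in which the example above stacks the sentiment features.
SENTIMENT_ORDER = ("positive", "neutral", "negative")


def sentiment_vector(dist: List[Dict[str, Any]]) -> Tuple[float, ...]:
    """Map a list of {"label", "score"} dicts to scores in SENTIMENT_ORDER."""
    # Index the scores by lowercased label so the input order does not matter.
    scores = {str(d["label"]).lower(): float(d["score"]) for d in dist}
    missing = [name for name in SENTIMENT_ORDER if name not in scores]
    if missing:
        raise KeyError(f"missing sentiment labels: {missing}")
    return tuple(scores[name] for name in SENTIMENT_ORDER)


# Illustrative values only, not real model output.
example = [
    {"label": "neutral", "score": 0.7},
    {"label": "negative", "score": 0.2},
    {"label": "positive", "score": 0.1},
]
print(sentiment_vector(example))
```

Inside `classify_subjectivity`, the returned tuple could be unpacked as `pos, neu, neg = sentiment_vector(dist)` before the tensors are built. Failing loudly on a missing label avoids silently feeding a misordered feature vector to the classifier.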
 ## Citation
+If you find our work helpful or inspiring, please feel free to cite it:
 ```bibtex
+@misc{fasulo2025aiwizardscheckthat2025,
+      title={AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles},
+      author={Matteo Fasulo and Luca Babboni and Luca Tedeschini},
+      year={2025},
+      eprint={2507.11764},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2507.11764},
 }
 ```