mdeberta-v3-base-subjectivity-sentiment-english

This model is a fine-tuned version of microsoft/mdeberta-v3-base for subjectivity detection in news articles. It was developed as part of AI Wizards' participation in the CLEF 2025 CheckThat! Lab Task 1.

It achieves the following results on the evaluation set:

  • Loss: 0.6011
  • Macro F1: 0.7727
  • Macro P: 0.7749
  • Macro R: 0.7743
  • Subj F1: 0.7702
  • Subj P: 0.8111
  • Subj R: 0.7333
  • Accuracy: 0.7727
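
As a quick consistency check on the figures above, the subjective-class F1 can be recomputed as the harmonic mean of the reported precision and recall. The sketch below also assumes that macro F1 is the unweighted mean of the two per-class F1 scores (the usual definition for a binary task). The implied objective-class F1 is derived here for illustration and is not a figure reported by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported above for the subjective (SUBJ) class.
subj_p, subj_r = 0.8111, 0.7333
subj_f1 = f1_score(subj_p, subj_r)
print(f"Subj F1 recomputed: {subj_f1:.4f}")  # 0.7702

# Assumption: macro F1 = (SUBJ F1 + OBJ F1) / 2, so OBJ F1 can be backed out.
macro_f1 = 0.7727
implied_obj_f1 = 2 * macro_f1 - subj_f1
print(f"Implied OBJ F1 (derived, approximate): {implied_obj_f1:.3f}")
```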

Model description

This model is a transformer-based classifier designed to classify sentences as subjective (opinion-laden) or objective within news articles. It was presented in the paper "AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles".

The model's main contribution is its sentiment-augmented architecture: sentiment scores from an auxiliary model are concatenated with the sentence representation produced by mDeBERTaV3-base before classification. This approach aims to improve on standard fine-tuning, in particular the F1 score on the subjective class. In addition, the training framework applies decision threshold calibration to address class imbalance, which is prevalent across the task's languages.

Intended uses & limitations

This model is intended for subjectivity detection in news articles, classifying sentences as subjective (opinion-laden) or objective. This functionality is a key component in combating misinformation, improving fact-checking pipelines, and supporting journalists in content analysis.

Intended Uses:

  • Classifying English news sentences as subjective or objective.
  • Assisting in fact-checking, media analysis, and news aggregation by distinguishing between factual and opinionated content.

Limitations:

  • This specific model (mdeberta-v3-base-subjectivity-sentiment-english) is primarily optimized for and evaluated on English news articles. While the broader research explored multilingual and zero-shot settings, performance on other languages might vary.
  • The official challenge results for the multilingual setting were affected by a submission mistake. Internal evaluations run after correcting it indicate that actual performance is better than the officially reported scores in some cases.
  • As with any trained model, potential biases present in the training data could be reflected in its predictions.

Training and evaluation data

The model was trained and developed using datasets provided for the CLEF 2025 CheckThat! Lab Task 1: Subjectivity Detection in News Articles. These datasets included text in Arabic, German, English, Italian, and Bulgarian for training/development. For this specific English model, the English training and development datasets were utilized.

Training enhanced the transformer-based classifier by combining sentiment scores from an auxiliary model with sentence representations, with the aim of improving the detection of subjective sentences. To address the class imbalance prevalent across languages, the decision threshold was calibrated on the development set. The final evaluation also included unseen languages (e.g., Greek, Romanian, Polish, Ukrainian) to assess the generalization capabilities of the other models in AI Wizards' participation.
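
Decision threshold calibration can be sketched as a simple grid search: for each candidate threshold, a sentence is labeled SUBJ when its predicted SUBJ probability reaches the threshold, and the threshold that maximizes macro F1 on the development set is kept. The code below is a minimal illustration under those assumptions. The function names, the grid, and the toy data are hypothetical and are not taken from the authors' implementation.

```python
from typing import Optional, Sequence, Tuple


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores for labels 0 (OBJ) and 1 (SUBJ)."""
    scores = []
    for cls in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def calibrate_threshold(
    y_true: Sequence[int],
    subj_probs: Sequence[float],
    grid: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Return (best_threshold, best_macro_f1) over a grid of candidate thresholds."""
    if grid is None:
        grid = [i / 100 for i in range(5, 96)]  # 0.05, 0.06, ..., 0.95
    best_t, best_score = 0.5, -1.0
    for t in grid:
        preds = [1 if p >= t else 0 for p in subj_probs]
        score = macro_f1(y_true, preds)
        if score > best_score:  # ties keep the lowest threshold seen first
            best_t, best_score = t, score
    return best_t, best_score


# Toy development set (hypothetical): the classifier under-scores SUBJ sentences,
# so the default 0.5 threshold misses most of them.
dev_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
dev_probs = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.40, 0.45, 0.80]

default_preds = [1 if p >= 0.5 else 0 for p in dev_probs]
print(f"macro F1 at 0.50: {macro_f1(dev_labels, default_preds):.3f}")

threshold, score = calibrate_threshold(dev_labels, dev_probs)
print(f"best threshold={threshold:.2f}, macro F1={score:.3f}")
```

Calibrating on the development set rather than the test set keeps the threshold choice independent of the data used for the final scores.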

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 6
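
With lr_scheduler_type set to linear, the learning rate decays linearly from its initial value to zero over the course of training. The sketch below reproduces that schedule in plain Python, assuming no warmup steps (none are listed above) and 312 total optimizer steps (6 epochs of 52 steps, as in the training results). The function name is illustrative.

```python
def linear_lr(step: int, base_lr: float = 1e-5, total_steps: int = 312,
              warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start of training and at the end of each epoch.
for epoch in range(0, 7):
    step = epoch * 52
    print(f"epoch {epoch}: step {step:3d}, lr = {linear_lr(step):.2e}")
```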

Training results

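| Training Loss | Epoch | Step | Validation Loss | Macro F1 | Macro P | Macro R | Subj F1 | Subj P | Subj R | Accuracy |
|---------------|-------|------|-----------------|----------|---------|---------|---------|--------|--------|----------|
| No log        | 1.0   | 52   | 0.7082          | 0.3383   | 0.7418  | 0.5062  | 0.0247  | 1.0    | 0.0125 | 0.4870   |
| No log        | 2.0   | 104  | 0.6365          | 0.7639   | 0.7639  | 0.7643  | 0.7696  | 0.7811 | 0.7583 | 0.7641   |
| No log        | 3.0   | 156  | 0.8187          | 0.6510   | 0.7128  | 0.6732  | 0.5822  | 0.8244 | 0.45   | 0.6645   |
| No log        | 4.0   | 208  | 0.6188          | 0.7594   | 0.7653  | 0.7623  | 0.7506  | 0.8146 | 0.6958 | 0.7597   |
| No log        | 5.0   | 260  | 0.6048          | 0.7617   | 0.7659  | 0.7641  | 0.7556  | 0.8095 | 0.7083 | 0.7619   |
| No log        | 6.0   | 312  | 0.6011          | 0.7727   | 0.7749  | 0.7743  | 0.7702  | 0.8111 | 0.7333 | 0.7727   |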

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.5.1+cu121
  • Datasets 3.3.1
  • Tokenizers 0.21.0

How to use

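Because the classifier takes three sentiment scores as additional inputs, the model cannot be loaded with the standard transformers text-classification pipeline alone. The example below uses a pipeline for the auxiliary sentiment model and a custom model class for the subjectivity classifier: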

import torch
import torch.nn as nn
from transformers import AutoTokenizer, DebertaV2Config, DebertaV2Model, PreTrainedModel, pipeline
from transformers.models.deberta.modeling_deberta import ContextPooler

sent_pipe = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-xlm-roberta-base-sentiment",
    tokenizer="cardiffnlp/twitter-xlm-roberta-base-sentiment",
    top_k=None,  # return all 3 sentiment scores
)

class CustomModel(PreTrainedModel):
    """mDeBERTa encoder whose pooled output is concatenated with three sentiment scores."""
    config_class = DebertaV2Config

    def __init__(self, config, sentiment_dim=3, num_labels=2, *args, **kwargs):
        super().__init__(config, *args, **kwargs)
        self.deberta = DebertaV2Model(config)
        self.pooler = ContextPooler(config)
        output_dim = self.pooler.output_dim
        self.dropout = nn.Dropout(0.1)
        # The classifier sees the pooled sentence embedding plus the sentiment scores.
        self.classifier = nn.Linear(output_dim + sentiment_dim, num_labels)

    def forward(self, input_ids, positive, neutral, negative, token_type_ids=None, attention_mask=None, labels=None):
        # token_type_ids and labels are accepted but not used in this inference-only example.
        outputs = self.deberta(input_ids=input_ids, attention_mask=attention_mask)
        encoder_layer = outputs[0]  # last hidden states: (batch, seq_len, hidden)
        pooled_output = self.pooler(encoder_layer)  # (batch, output_dim)
        # Stack the per-example scores into a (batch, 3) tensor with the embedding's dtype.
        sentiment_features = torch.stack((positive, neutral, negative), dim=1).to(pooled_output.dtype)
        combined_features = torch.cat((pooled_output, sentiment_features), dim=1)
        logits = self.classifier(self.dropout(combined_features))
        return {'logits': logits}

model_name = "MatteoFasulo/mdeberta-v3-base-subjectivity-sentiment-english"
tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")
config = DebertaV2Config.from_pretrained(
    model_name, 
    num_labels=2, 
    id2label={0: 'OBJ', 1: 'SUBJ'}, 
    label2id={'OBJ': 0, 'SUBJ': 1},
    output_attentions=False, 
    output_hidden_states=False
)
# from_pretrained is a classmethod: call it on the class and pass the config explicitly
model = CustomModel.from_pretrained(model_name, config=config)

def classify_subjectivity(text: str):
    # get full sentiment distribution
    dist = sent_pipe(text)[0]
    pos = next(d["score"] for d in dist if d["label"] == "positive")
    neu = next(d["score"] for d in dist if d["label"] == "neutral")
    neg = next(d["score"] for d in dist if d["label"] == "negative")

    # tokenize the text
    inputs = tokenizer(text, padding=True, truncation=True, max_length=256, return_tensors='pt')

    # feeding in the three sentiment scores
    with torch.no_grad():
        outputs = model(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            positive=torch.tensor(pos).unsqueeze(0).float(),
            neutral=torch.tensor(neu).unsqueeze(0).float(),
            negative=torch.tensor(neg).unsqueeze(0).float()
        )

    # compute probabilities and pick the top label
    probs = torch.softmax(outputs.get('logits')[0], dim=-1)
    label = model.config.id2label[int(probs.argmax())]
    score = probs.max().item()

    return {"label": label, "score": score}

examples = [
    "The company reported a 10% increase in revenue for the last quarter.",
    "And it could even be used to gather intelligence on Russian operations.",
    "Dramatic pictures the next day show the charred and hollowed out relic of a once impressive and key Russian vessel.",
    "Demands upon the public credit for social service are most difficult to resist."
]
for text in examples:
    result = classify_subjectivity(text)
    print(f"Text: {text}")
    print(f"→ Subjectivity: {result['label']} (score={result['score']:.2f})\n")
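
The example above selects the label with the highest probability, which for two classes is equivalent to a 0.5 threshold on the SUBJ probability. If a calibrated threshold is available, it could be applied to the logits as sketched below. The helper name and the threshold value of 0.4 are hypothetical; this card does not state the calibrated threshold.

```python
import math
from typing import Dict, Sequence


def label_with_threshold(logits: Sequence[float], threshold: float = 0.5) -> Dict[str, object]:
    """Map two logits ordered as [OBJ, SUBJ] to a label using a threshold on P(SUBJ)."""
    # Numerically stable softmax over the two logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    subj_prob = exps[1] / sum(exps)
    label = "SUBJ" if subj_prob >= threshold else "OBJ"
    return {"label": label, "subj_prob": subj_prob}


# Hypothetical logits for one sentence, e.g. outputs["logits"][0].tolist()
logits = [0.3, 0.1]
print(label_with_threshold(logits))                 # default 0.5 threshold
print(label_with_threshold(logits, threshold=0.4))  # hypothetical calibrated threshold
```

With these toy logits the SUBJ probability is about 0.45, so the sentence is labeled OBJ at the default threshold and SUBJ at the lower one.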

Code

The official code and materials for this project are available on GitHub: https://github.com/MatteoFasulo/clef2025-checkthat

Citation

If you find our work helpful or inspiring, please feel free to cite it:

@misc{fasulo2025aiwizardscheckthat2025,
      title={AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles}, 
      author={Matteo Fasulo and Luca Babboni and Luca Tedeschini},
      year={2025},
      eprint={2507.11764},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2507.11764}, 
}