AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles
Abstract
Sentiment-augmented transformer-based classifiers improve subjectivity detection in multilingual and zero-shot settings, achieving high performance and ranking first for Greek.
This paper presents AI Wizards' participation in the CLEF 2025 CheckThat! Lab Task 1: Subjectivity Detection in News Articles, classifying sentences as subjective/objective in monolingual, multilingual, and zero-shot settings. Training/development datasets were provided for Arabic, German, English, Italian, and Bulgarian; final evaluation included additional unseen languages (e.g., Greek, Romanian, Polish, Ukrainian) to assess generalization. Our primary strategy enhanced transformer-based classifiers by integrating sentiment scores, derived from an auxiliary model, with sentence representations, aiming to improve upon standard fine-tuning. We explored this sentiment-augmented architecture with mDeBERTaV3-base, ModernBERT-base (English), and Llama3.2-1B. To address class imbalance, prevalent across languages, we employed decision threshold calibration optimized on the development set. Our experiments show sentiment feature integration significantly boosts performance, especially subjective F1 score. This framework led to high rankings, notably 1st for Greek (Macro F1 = 0.51).
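The decision threshold calibration mentioned above can be illustrated with a short sketch: sweep candidate thresholds over dev-set probabilities and keep the one that maximizes macro F1. This is an assumption-laden illustration (the grid, metric implementation, and function names are ours, not the authors' released code).

```python
import numpy as np

def macro_f1(y_true, y_pred):
    """Macro-averaged F1 over the two classes (0 = objective, 1 = subjective)."""
    f1s = []
    for cls in (0, 1):
        tp = np.sum((y_pred == cls) & (y_true == cls))
        fp = np.sum((y_pred == cls) & (y_true != cls))
        fn = np.sum((y_pred != cls) & (y_true == cls))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return float(np.mean(f1s))

def calibrate_threshold(probs, labels, grid=np.linspace(0.1, 0.9, 81)):
    """Pick the decision threshold on P(subjective) that maximizes dev macro F1."""
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        f1 = macro_f1(labels, (probs >= t).astype(int))
        if f1 > best_f1:
            best_t, best_f1 = float(t), f1
    return best_t, best_f1
```

On an imbalanced dev set, the calibrated threshold typically shifts away from 0.5 toward the majority class, raising subjective-class recall at test time.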
Community
AI Wizards — “Enhancing Transformer‑Based Embeddings with Sentiment for Subjectivity Detection in News Articles”
arXiv:2507.11764 | GitHub: MatteoFasulo/clef2025-checkthat
Announcement:
We are pleased to introduce our latest work from the CLEF 2025 CheckThat! Lab (Task 1). This paper outlines a novel framework for classifying sentences in news articles as subjective or objective by integrating sentiment features into transformer embeddings.
Leveraging mDeBERTa v3, ModernBERT, and LLaMA 3.2‑1B across monolingual (Arabic, German, English, Italian, Bulgarian), multilingual, and zero‑shot (Greek, Polish, Romanian, Ukrainian) settings, our approach augments contextual embeddings with sentiment scores from a sentiment analysis model. This enhancement, combined with calibrated decision thresholds to address dataset imbalance, consistently improves performance—especially on the subjective class.
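The fusion step described above can be sketched as follows: concatenate the pooled sentence embedding with the auxiliary model's sentiment probabilities before the classification head. This is a minimal numpy illustration under assumed dimensions (768-d encoder output, a 3-way negative/neutral/positive sentiment vector); the class and layer names are hypothetical, not the released implementation.

```python
import numpy as np

class SentimentAugmentedClassifier:
    """Sketch: fuse a transformer sentence embedding with auxiliary
    sentiment scores, then classify as subjective vs. objective."""

    def __init__(self, hidden_size=768, sentiment_dim=3, num_labels=2, seed=0):
        rng = np.random.default_rng(seed)
        # Fusion layer maps [embedding ; sentiment] back to hidden_size.
        self.W_fuse = rng.normal(scale=0.02, size=(hidden_size + sentiment_dim, hidden_size))
        # Linear classification head over the fused representation.
        self.W_out = rng.normal(scale=0.02, size=(hidden_size, num_labels))

    def forward(self, cls_embedding, sentiment_scores):
        # cls_embedding: (batch, hidden_size) pooled encoder output
        # sentiment_scores: (batch, 3) e.g. neg/neu/pos probabilities
        fused = np.concatenate([cls_embedding, sentiment_scores], axis=-1)
        return np.tanh(fused @ self.W_fuse) @ self.W_out  # (batch, num_labels) logits
```

In practice the encoder (e.g. mDeBERTa v3) and the sentiment model would both be fine-tuned or frozen transformers; only the fusion idea is shown here.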
Key results include top placement in the Greek zero‑shot track and improved SUBJ F1 in English and Italian.
The full codebase and datasets are publicly available.