# RoBERTa-SA-FT (Anno-lexical coreset)
This model is a sentence-level lexical media-bias classifier trained on the coreset (BABE-scale) subset of the Anno-lexical dataset from
"The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection" (NAACL Findings 2025; arXiv:2411.11081).
It is a roberta-base encoder with a 2-layer classification head. Labels: 0 = neutral (no lexical bias), 1 = lexical bias.
- Paper: The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection
- Dataset: mediabiasgroup/anno-lexical-coreset
## Intended use & limitations
- Intended use: research on lexical/loaded-language bias; comparison to human-label fine-tuning under equal data size.
- Out-of-scope: detection of non-lexical media bias forms (e.g., informational/selection bias), leaning, stance, factuality.
- Known caveats: coreset-trained SA-FT tends to increase recall at the cost of precision compared to human-labeled FT.
## How to use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

m = "mediabiasgroup/roberta-anno-lexical-coreset-ft"
tok = AutoTokenizer.from_pretrained(m)
model = AutoModelForSequenceClassification.from_pretrained(m)
```
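A minimal end-to-end inference sketch building on the snippet above; the example sentence is illustrative, and the label ids follow the mapping stated earlier (0 = neutral, 1 = lexical bias):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

m = "mediabiasgroup/roberta-anno-lexical-coreset-ft"
tok = AutoTokenizer.from_pretrained(m)
model = AutoModelForSequenceClassification.from_pretrained(m)
model.eval()

# illustrative input sentence with loaded wording
inputs = tok(
    "The senator's reckless scheme betrayed hardworking taxpayers.",
    return_tensors="pt", truncation=True, max_length=512,
)
with torch.no_grad():
    logits = model(**inputs).logits

pred = int(logits.argmax(dim=-1))              # 0 = neutral, 1 = lexical bias
probs = torch.softmax(logits, dim=-1)[0].tolist()  # class probabilities
```

For batch scoring, pass a list of sentences to the tokenizer with `padding=True`.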
## Training data & setup

- Training data: Anno-lexical coreset (BABE-scale) with binary labels aggregated from LLM annotations.
- Base encoder: roberta-base; head: 2-layer classifier.
- Hardware: single A100 GPU; single training run.
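The setup above can be sketched as follows. Note that `AutoModelForSequenceClassification` attaches RoBERTa's default classification head (a dense + tanh layer followed by a linear output projection), which matches the "2-layer" description; the label names mirror this card and are otherwise illustrative:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# roberta-base encoder with the default two-layer classification head
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=2,
    id2label={0: "neutral", 1: "lexical-bias"},
    label2id={"neutral": 0, "lexical-bias": 1},
)

tok = AutoTokenizer.from_pretrained("roberta-base")
batch = tok(["An illustrative sentence."], return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits
print(logits.shape)  # one row per sentence, one score per label
```

Fine-tuning then proceeds as standard binary sequence classification (cross-entropy over the two labels); the paper's exact hyperparameters are not reproduced here.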
## Evaluation

| Dataset | Precision | Recall | F1 | MCC |
|---|---|---|---|---|
| BABE (test) | 0.829 | 0.859 | 0.844 | 0.638 |
| BASIL (all sentences) | 0.136 | 0.696 | 0.228 | 0.201 |

Positive class = lexical bias; BASIL's informational-bias sentences are treated as neutral.
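The reported numbers follow the standard binary definitions with lexical bias as the positive class. A quick way to compute them from predictions, shown here on toy labels (scikit-learn assumed):

```python
from sklearn.metrics import (
    precision_score, recall_score, f1_score, matthews_corrcoef,
)

# toy gold/predicted labels for illustration; 1 = lexical bias (positive class)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

p = precision_score(y_true, y_pred)   # TP / (TP + FP)
r = recall_score(y_true, y_pred)      # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)         # harmonic mean of P and R
mcc = matthews_corrcoef(y_true, y_pred)  # balanced even under class skew
print(p, r, f1, mcc)  # → 0.75 0.75 0.75 0.5
```

MCC is worth reporting alongside F1 here because lexical bias is a minority class in BASIL, where accuracy and F1 alone can be misleading.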
## Safety, bias & ethics
Media bias perception is subjective and culturally dependent. This model may over-flag biased wording and should not be used to penalize individuals or outlets. Use with human-in-the-loop review and domain-specific calibration.
## Citation

If you use this model, please cite:

```bibtex
@inproceedings{horych-etal-2025-promises,
  title = "The Promises and Pitfalls of {LLM} Annotations in Dataset Labeling: a Case Study on Media Bias Detection",
  author = "Horych, Tom{\'a}{\v{s}} and
    Mandl, Christoph and
    Ruas, Terry and
    Greiner-Petter, Andre and
    Gipp, Bela and
    Aizawa, Akiko and
    Spinde, Timo",
  editor = "Chiruzzo, Luis and
    Ritter, Alan and
    Wang, Lu",
  booktitle = "Findings of the Association for Computational Linguistics: NAACL 2025",
  month = apr,
  year = "2025",
  address = "Albuquerque, New Mexico",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2025.findings-naacl.75/",
  doi = "10.18653/v1/2025.findings-naacl.75",
  pages = "1370--1386",
  ISBN = "979-8-89176-195-7"
}
```