---
license: cc-by-nc-4.0
pipeline_tag: text-classification
library_name: transformers
language: [en]
tags:
- media-bias
- lexical-bias
- arxiv:2411.11081
- naacl-2025
- coreset
datasets:
- mediabiasgroup/anno-lexical-coreset
base_model: roberta-base
model-index:
- name: RoBERTa-SA-FT (Anno-lexical coreset)
results:
- task: {type: text-classification, name: Lexical bias detection}
dataset: {name: BABE (test), type: mediabiasgroup/BABE}
metrics:
- {name: precision, type: precision, value: 0.829}
- {name: recall, type: recall, value: 0.859}
- {name: f1, type: f1, value: 0.844}
- {name: mcc, type: matthews_correlation, value: 0.638}
- task: {type: text-classification, name: Lexical bias detection}
dataset: {name: BASIL (all sentences), type: BASIL}
metrics:
- {name: precision, type: precision, value: 0.136}
- {name: recall, type: recall, value: 0.696}
- {name: f1, type: f1, value: 0.228}
- {name: mcc, type: matthews_correlation, value: 0.201}
---
# RoBERTa-SA-FT (Anno-lexical coreset)
This model is a **sentence-level media (lexical) bias classifier** trained on the **coreset** (BABE-scale) subset of the Anno-lexical dataset from
*“The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection”* (NAACL Findings 2025; arXiv:2411.11081).
It is a `roberta-base` encoder with a 2-layer classification head. Labels: `0 = neutral (no lexical bias)`, `1 = lexical bias`.

**Paper:** [The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection](https://arxiv.org/abs/2411.11081)

**Dataset:** [mediabiasgroup/anno-lexical-coreset](https://huggingface.co/datasets/mediabiasgroup/anno-lexical-coreset)
## Intended use & limitations
- **Intended use:** research on lexical/loaded-language bias; comparison with fine-tuning on human-annotated data of equal size.
- **Out-of-scope:** detection of non-lexical media bias forms (e.g., informational/selection bias), leaning, stance, factuality.
- **Known caveats:** coreset-trained SA-FT tends to **increase recall** at the cost of **precision** compared to human-labeled FT.
## How to use
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

m = "mediabiasgroup/roberta-anno-lexical-coreset-ft"
tok = AutoTokenizer.from_pretrained(m)
model = AutoModelForSequenceClassification.from_pretrained(m)

inputs = tok("Example sentence to classify.", return_tensors="pt", truncation=True)
pred = model(**inputs).logits.argmax(dim=-1).item()  # 0 = neutral, 1 = lexical bias
```
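The standard `pipeline` API also works. Note that unless `id2label` is set in the model config, the pipeline reports generic `LABEL_0`/`LABEL_1` names, which correspond to the label scheme above:

```python
from transformers import pipeline

clf = pipeline("text-classification", model="mediabiasgroup/roberta-anno-lexical-coreset-ft")
print(clf("Example sentence to classify."))  # e.g. [{'label': 'LABEL_1', 'score': ...}]
```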
## Training data & setup
- **Training data:** Anno-lexical coreset (BABE-scale) with binary labels aggregated from LLM annotations.
- **Base encoder:** `roberta-base`; **head:** 2-layer classifier.
- **Hardware:** single A100 GPU; single training run.
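A minimal fine-tuning sketch along these lines is shown below. The column names (`text`/`label`) and all hyperparameters are illustrative assumptions, not values taken from the paper:

```python
# Illustrative fine-tuning sketch; assumed "text"/"label" columns and
# default-ish hyperparameters (not the paper's exact settings).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

ds = load_dataset("mediabiasgroup/anno-lexical-coreset")
tok = AutoTokenizer.from_pretrained("roberta-base")
ds = ds.map(lambda b: tok(b["text"], truncation=True), batched=True)

# AutoModelForSequenceClassification adds RoBERTa's standard two-layer head
# (dense + output projection), matching the setup described above.
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
args = TrainingArguments(output_dir="roberta-sa-ft", per_device_train_batch_size=32,
                         learning_rate=2e-5, num_train_epochs=3)
Trainer(model=model, args=args, train_dataset=ds["train"], tokenizer=tok).train()
```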
## Evaluation
| Benchmark | Precision | Recall | F1 | MCC |
|---|---|---|---|---|
| BABE (test) | 0.829 | 0.859 | 0.844 | 0.638 |
| BASIL (all sentences) | 0.136 | 0.696 | 0.228 | 0.201 |

Positive class = lexical bias; BASIL's informational-bias sentences are treated as neutral.
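The reported numbers can be recomputed from model predictions with standard scikit-learn metrics; the arrays below are placeholders, not actual BABE/BASIL labels:

```python
# Placeholder labels/predictions: substitute real gold labels and model outputs.
from sklearn.metrics import f1_score, matthews_corrcoef, precision_score, recall_score

y_true = [1, 0, 1, 1, 0]  # gold labels (1 = lexical bias)
y_pred = [1, 0, 0, 1, 0]  # model predictions
for name, fn in [("P", precision_score), ("R", recall_score),
                 ("F1", f1_score), ("MCC", matthews_corrcoef)]:
    print(name, round(fn(y_true, y_pred), 3))
```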
## Safety, bias & ethics
Media bias perception is subjective and culturally dependent. This model may over-flag biased wording and should not be used to penalize individuals or outlets. Use with human-in-the-loop review and domain-specific calibration.
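Given the recall-heavy behavior noted above, one simple form of calibration is to flag a sentence only when the positive-class probability clears a threshold tuned on held-out data. A minimal sketch; the 0.7 threshold is purely illustrative:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

m = "mediabiasgroup/roberta-anno-lexical-coreset-ft"
tok = AutoTokenizer.from_pretrained(m)
model = AutoModelForSequenceClassification.from_pretrained(m)

inputs = tok(["Sentence one.", "Sentence two."], return_tensors="pt",
             padding=True, truncation=True)
with torch.no_grad():
    p_bias = model(**inputs).logits.softmax(dim=-1)[:, 1]  # P(lexical bias)
flags = p_bias > 0.7  # illustrative threshold; stricter than plain argmax
```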
## Citation
If you use this model, please cite:
```bibtex
@inproceedings{horych-etal-2025-promises,
title = "The Promises and Pitfalls of {LLM} Annotations in Dataset Labeling: a Case Study on Media Bias Detection",
author = "Horych, Tom{\'a}{\v{s}} and
Mandl, Christoph and
Ruas, Terry and
Greiner-Petter, Andre and
Gipp, Bela and
Aizawa, Akiko and
Spinde, Timo",
editor = "Chiruzzo, Luis and
Ritter, Alan and
Wang, Lu",
booktitle = "Findings of the Association for Computational Linguistics: NAACL 2025",
month = apr,
year = "2025",
address = "Albuquerque, New Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.findings-naacl.75/",
doi = "10.18653/v1/2025.findings-naacl.75",
pages = "1370--1386",
ISBN = "979-8-89176-195-7"
}
```