# cardiffnlp/twitter-xlm-roberta-base-hate-spanish
This model is a fine-tuned version of [cardiffnlp/twitter-xlm-roberta-base](https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base) using the [`HaterNet`](https://zenodo.org/record/2592149) dataset and the Spanish subset of
[`SemEval-2019 Task 5`](https://aclanthology.org/S19-2007/).
## The following metrics are achieved
* on the test split of `SemEval-2019 Task 5`
  - F1 (weighted): 0.7866
  - F1 (macro): 0.7935
  - Accuracy: 0.7937
* on a custom test split of `HaterNet`
  - F1 (weighted): 0.7815
  - F1 (macro): 0.6981
  - Accuracy: 0.7933
* on `HaterNet` and `SemEval-2019 Task 5`
  - F1 (weighted): 0.7908
  - F1 (macro): 0.7657
  - Accuracy: 0.7936
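
For readers unfamiliar with the difference between the macro and weighted F1 scores reported above, the following is a minimal pure-Python sketch of how the two averages are commonly computed. The function name `f1_scores`, the toy labels, and the `HATE` label string are illustrative assumptions; this is not the evaluation script used to produce the numbers in this card.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, weighted_f1, accuracy) for two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of gold examples per class
    per_class = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
    # Macro: every class counts equally, regardless of its frequency.
    macro = sum(per_class.values()) / len(labels)
    # Weighted: each class F1 is weighted by its share of gold examples.
    weighted = sum(per_class[l] * support[l] for l in labels) / len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return macro, weighted, accuracy


# Toy, made-up labels (the "HATE" label name is assumed for illustration).
y_true = ["NOT-HATE"] * 8 + ["HATE"] * 2
y_pred = ["NOT-HATE"] * 9 + ["HATE"]
macro, weighted, accuracy = f1_scores(y_true, y_pred)
```

On imbalanced data such as this toy example, the macro score is pulled down by the minority class more strongly than the weighted score, which is one plausible reading of the gap between the two on the `HaterNet` split.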
### Usage
Install tweetnlp via pip.
```shell
pip install tweetnlp
```
Load the model in Python.
```python
import tweetnlp
model = tweetnlp.Classifier("cardiffnlp/twitter-xlm-roberta-base-hate-spanish")
# Classify a single Spanish tweet.
model.predict('Ismael es egocentrico porque se vuelve loca si le dicen que tiene el pelo bonito eso se define con otro objetivo #FirstDates251')
# >> {'label': 'NOT-HATE'}
```
### Datasets
@inproceedings{basile-etal-2019-semeval,
title = "{S}em{E}val-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in {T}witter",
author = "Basile, Valerio and
Bosco, Cristina and
Fersini, Elisabetta and
Nozza, Debora and
Patti, Viviana and
Rangel Pardo, Francisco Manuel and
Rosso, Paolo and
Sanguinetti, Manuela",
booktitle = "Proceedings of the 13th International Workshop on Semantic Evaluation",
month = jun,
year = "2019",
address = "Minneapolis, Minnesota, USA",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/S19-2007",
doi = "10.18653/v1/S19-2007",
pages = "54--63",
}
@article{quijano2019haternet,
title={HaterNet a system for detecting and analyzing hate speech in Twitter (Version 1.0)[Data set]},
author={Quijano-Sanchez, Lara and Kohatsu, Juan Carlos Pereira and Liberatore, Federico and Camacho-Collados, Miguel},
journal={Zenodo},
year={2019}
}