---
license: mit
datasets:
- ucberkeley-dlab/measuring-hate-speech
language:
- en
metrics:
- accuracy
- f1
- precision
- recall
library_name: transformers
pipeline_tag: text-classification
tags:
- offensive language detection
base_model:
- distilbert/distilbert-base-uncased
---

This model is part of the work done in .
The full code can be found at https://github.com/wetey/cluster-errors.

## Model Details

### Model Description

- **Model type:** DistilBERT
- **Language(s) (NLP):** English
- **Finetuned from model:** distilbert-base-uncased

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="wetey/distilbert-base-uncased-measuring-hate-speech")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("wetey/distilbert-base-uncased-measuring-hate-speech")
model = AutoModelForSequenceClassification.from_pretrained("wetey/distilbert-base-uncased-measuring-hate-speech")
```
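Continuing from the pipeline above, a quick illustrative call; the sample sentence is made up, and the exact label strings depend on the `id2label` mapping in the model config:

```python
# Illustrative usage; `pipe` is the pipeline created above.
# The label name shown is an assumption based on the dataset's classes.
result = pipe("I hope you have a wonderful day!")
print(result)
# e.g. [{'label': 'supportive', 'score': 0.98}]
```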
## Fine-tuning Details

### Fine-tuning Data

The model was fine-tuned on the [ucberkeley-dlab/measuring-hate-speech](https://huggingface.co/datasets/ucberkeley-dlab/measuring-hate-speech) dataset. We converted the continuous hate speech scores to categorical labels using the ranges suggested by the dataset authors; the ranges are listed on the [HuggingFace Dataset card](https://huggingface.co/datasets/ucberkeley-dlab/measuring-hate-speech).
Examples with hate speech scores lower than -1 are labeled `supportive`, scores between -1 and 0.5 are labeled `neutral`, and scores greater than 0.5 are labeled `hatespeech`.
We removed duplicate examples and examples that received fewer than three annotations in total, and we dropped the `neutral` class.
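A minimal sketch of this preprocessing, assuming pandas and the column names shown on the dataset card (`comment_id`, `hate_speech_score`, `text`); this is an illustration, not the authors' exact script (see the repository above for that):

```python
import pandas as pd
from datasets import load_dataset

# One row per annotation; hate_speech_score is the continuous score per comment.
df = load_dataset("ucberkeley-dlab/measuring-hate-speech")["train"].to_pandas()

# Keep comments that received at least three annotations.
df = df[df.groupby("comment_id")["comment_id"].transform("count") >= 3]

# Collapse to one row per comment and drop duplicate texts.
df = df.drop_duplicates(subset="comment_id").drop_duplicates(subset="text")

# Map continuous scores to labels using the suggested ranges.
def to_label(score):
    if score < -1:
        return "supportive"
    if score > 0.5:
        return "hatespeech"
    return "neutral"

df["label"] = df["hate_speech_score"].apply(to_label)

# Drop the neutral class.
df = df[df["label"] != "neutral"][["text", "label"]]
```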
After these steps, we were left with 12,289 examples: 7,497 labeled `supportive` and 4,792 labeled `hatespeech`. We used 85% of the dataset for fine-tuning and 15% for testing.

### Fine-tuning Procedure

The exact fine-tuning procedure can be found [here](https://github.com/wetey/cluster-errors/tree/master/finetuning).

#### Fine-tuning Hyperparameters

- `evaluation_strategy = 'epoch'`
- `logging_steps = 1`
- `num_train_epochs = 5`
- `learning_rate = 1e-5`
- `eval_accumulation_steps = 2`
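Expressed with the Hugging Face `Trainer` API, these settings would look roughly like the sketch below; unreported values (batch size, output path, and so on) are assumptions left at defaults or placeholders:

```python
# Sketch of the reported hyperparameters as TrainingArguments.
# Only the five values listed above come from this card; everything
# else is a default or a hypothetical placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-measuring-hate-speech",  # hypothetical path
    evaluation_strategy="epoch",
    logging_steps=1,
    num_train_epochs=5,
    learning_rate=1e-5,
    eval_accumulation_steps=2,
)
```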
## Evaluation

### Testing Data

The test set used can be found [here](https://github.com/wetey/cluster-errors/tree/master/data/datasets).

### Results

- `accuracy`: 89.3%
- `precision`: 89.4%
- `recall`: 89.3%
- `f1-score`: 89.3%
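These scores, and the per-class breakdown below, could be reproduced along the lines of the following sketch, assuming the test split is loaded as a pandas DataFrame with `text` and `label` columns and that the model's output labels match the dataset labels; the file path is hypothetical:

```python
import pandas as pd
from sklearn.metrics import classification_report
from transformers import pipeline

pipe = pipeline("text-classification",
                model="wetey/distilbert-base-uncased-measuring-hate-speech")

test = pd.read_csv("test.csv")  # hypothetical path to the test split
preds = [p["label"] for p in pipe(test["text"].tolist(), truncation=True)]

# Prints accuracy plus per-class precision, recall, and F1.
print(classification_report(test["label"], preds))
```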
#### Results per class

| Label | Precision | Recall | F1-score |
|------------|-----------|--------|----------|
| supportive | 92% | 91% | 91% |
| hatespeech | 86% | 87% | 86% |

## Citation