---
license: mit
language:
- ar
metrics:
- accuracy
- f1
- precision
- recall
library_name: transformers
tags:
- offensive language detection
base_model:
- UBC-NLP/MARBERT
---
This model is part of the work done in <!-- add paper name -->. <br>
The full code can be found at https://github.com/wetey/cluster-errors.
## Model Details
### Model Description
- **Model type:** BERT-based
- **Language(s) (NLP):** Arabic
- **Finetuned from model:** UBC-NLP/MARBERT
## How to Get Started with the Model
Use the code below to get started with the model.
```python
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="wetey/MARBERT-LHSAB")
```
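A minimal usage sketch for the pipeline (the example sentence is arbitrary, and the exact label strings returned depend on the model's `id2label` mapping, which may be generic like `LABEL_0` rather than class names):

```python
from transformers import pipeline

pipe = pipeline("text-classification", model="wetey/MARBERT-LHSAB")

# Any Arabic text works here; this sentence is just a placeholder.
result = pipe("هذا مثال")

# result is a list with one dict per input: [{"label": ..., "score": ...}]
print(result)
```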
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("wetey/MARBERT-LHSAB")
model = AutoModelForSequenceClassification.from_pretrained("wetey/MARBERT-LHSAB")
```
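With the tokenizer/model pair loaded directly, inference is a forward pass followed by a softmax over the logits. A sketch (the input text is a placeholder; label names are taken from `model.config.id2label` and may be generic):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("wetey/MARBERT-LHSAB")
model = AutoModelForSequenceClassification.from_pretrained("wetey/MARBERT-LHSAB")

text = "هذا مثال"  # any Arabic text
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to class probabilities and look up the predicted label.
probs = torch.softmax(logits, dim=-1)[0]
pred = model.config.id2label[int(probs.argmax())]
print(pred, probs.tolist())
```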
## Fine-tuning Details
### Fine-tuning Data
This model is fine-tuned on the [L-HSAB dataset](https://github.com/Hala-Mulki/L-HSAB-First-Arabic-Levantine-HateSpeech-Dataset). The exact version we use (after removing duplicates) can be found [](). <!--TODO-->
### Fine-tuning Procedure
The exact fine-tuning procedure can be found [here](https://github.com/wetey/cluster-errors/tree/master/finetuning).
#### Training Hyperparameters
```
evaluation_strategy = 'epoch'
logging_steps = 1
num_train_epochs = 5
learning_rate = 1e-5
eval_accumulation_steps = 2
```
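These hyperparameters map directly onto `transformers.TrainingArguments`; a minimal sketch (`output_dir` is a placeholder, and all unlisted arguments are left at their library defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="marbert-lhsab",    # hypothetical output path
    evaluation_strategy="epoch",   # evaluate at the end of every epoch
    logging_steps=1,
    num_train_epochs=5,
    learning_rate=1e-5,
    eval_accumulation_steps=2,     # move eval predictions to CPU every 2 steps
)
```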
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data
The test set used can be found [here](https://github.com/wetey/cluster-errors/tree/master/data/datasets).
### Results
`accuracy`: 87.9% <br>
`precision`: 88.1% <br>
`recall`: 87.9% <br>
`f1-score`: 87.9% <br>
#### Results per class
| Label | Precision | Recall | F1-score|
|---------|---------|---------|---------|
| normal | 85% | 82% | 83% |
| abusive | 93% | 92% | 93% |
| hate | 68% | 78% | 72% |
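Per-class scores like those in the table can be computed with scikit-learn's `classification_report`. A self-contained sketch with made-up toy predictions (not the actual test-set outputs):

```python
from sklearn.metrics import classification_report

# Toy gold labels and predictions for illustration only.
labels = ["normal", "abusive", "hate"]
y_true = ["normal", "abusive", "hate", "abusive", "normal", "hate"]
y_pred = ["normal", "abusive", "hate", "abusive", "hate", "hate"]

# output_dict=True gives per-class precision/recall/F1 as nested dicts.
report = classification_report(y_true, y_pred, labels=labels, output_dict=True)
print(classification_report(y_true, y_pred, labels=labels))
```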
## Citation
<!--TODO-->