---
license: mit
language:
- ar
metrics:
- accuracy
- f1
- precision
- recall
library_name: transformers
tags:
- offensive language detection
base_model:
- UBC-NLP/MARBERT
---
This model is part of the work done in <!-- add paper name -->. <br>
The full code can be found at https://github.com/wetey/cluster-errors
## Model Details
### Model Description
- **Model type:** BERT-based
- **Language(s) (NLP):** Arabic
- **Finetuned from model:** UBC-NLP/MARBERT
## How to Get Started with the Model
Use the code below to get started with the model.
```python
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="wetey/MARBERT-LHSAB")
```
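Once the pipeline is built, classification is a single call. A minimal usage sketch (the input string is a placeholder; the label names come from the model's config and are assumed here to match the classes reported below):
```python
# Hypothetical usage; replace the placeholder with an Arabic sentence.
result = pipe("text to classify")
print(result)  # e.g. [{'label': 'abusive', 'score': 0.98}]
```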
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("wetey/MARBERT-LHSAB")
model = AutoModelForSequenceClassification.from_pretrained("wetey/MARBERT-LHSAB")
```
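With the tokenizer and model loaded directly, inference looks like the following sketch (assuming a standard PyTorch backend; the input string is again a placeholder):
```python
import torch

# Hypothetical inference; replace the placeholder with an Arabic sentence.
inputs = tokenizer("text to classify", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_class])  # e.g. 'normal'
```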
## Fine-tuning Details
### Fine-tuning Data
This model is fine-tuned on the [L-HSAB](https://github.com/Hala-Mulki/L-HSAB-First-Arabic-Levantine-HateSpeech-Dataset) dataset. The exact version we use (after removing duplicates) can be found [](). <!--TODO-->
### Fine-tuning Procedure
The exact fine-tuning procedure can be found [here](https://github.com/wetey/cluster-errors/tree/master/finetuning).
#### Training Hyperparameters
```
evaluation_strategy = 'epoch'
logging_steps = 1
num_train_epochs = 5
learning_rate = 1e-5
eval_accumulation_steps = 2
```
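These values map directly onto `transformers.TrainingArguments`. A minimal sketch of how such a run could be wired up with the `Trainer` API (the output directory and dataset variables are illustrative assumptions, not part of the original procedure):
```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="marbert-lhsab",      # assumed name for checkpoints
    evaluation_strategy="epoch",
    logging_steps=1,
    num_train_epochs=5,
    learning_rate=1e-5,
    eval_accumulation_steps=2,
)

trainer = Trainer(
    model=model,                     # model loaded as shown above
    args=training_args,
    train_dataset=train_dataset,     # assumed: tokenized L-HSAB train split
    eval_dataset=eval_dataset,       # assumed: tokenized L-HSAB dev split
)
trainer.train()
```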
## Evaluation
### Testing Data
The test set used can be found [here](https://github.com/wetey/cluster-errors/tree/master/data/datasets).
### Results
`accuracy`: 87.9% <br>
`precision`: 88.1% <br>
`recall`: 87.9% <br>
`f1-score`: 87.9% <br>
#### Results per class
| Label | Precision | Recall | F1-score|
|---------|---------|---------|---------|
| normal | 85% | 82% | 83% |
| abusive | 93% | 92% | 93% |
| hate | 68% | 78% | 72% |
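The per-class numbers have the shape of a scikit-learn classification report. A sketch of how such a table could be reproduced on the test set (the variables `test_texts` and `y_true` are hypothetical names, not from the original code):
```python
from sklearn.metrics import classification_report

# Hypothetical: predict a label for each test example with the pipeline above,
# then compare against the gold labels y_true.
y_pred = [pipe(text)[0]["label"] for text in test_texts]
print(classification_report(y_true, y_pred, digits=2))
```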
## Citation
<!--TODO-->