---
license: mit
language:
- ar
metrics:
- accuracy
- f1
- precision
- recall
library_name: transformers
tags:
- offensive language detection
base_model:
- UBC-NLP/MARBERT
---


This model is part of the work done in <!-- add paper name -->. <br>
The full code can be found at [https://github.com/wetey/cluster-errors](https://github.com/wetey/cluster-errors).


## Model Details

### Model Description


- **Model type:** BERT-based
- **Language(s) (NLP):** Arabic
- **Finetuned from model:** UBC-NLP/MARBERT

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="wetey/MARBERT-LHSAB")

```
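Once created, the pipeline can be called directly on raw Arabic text. A minimal sketch (the input string and printed output below are illustrative, not from the original card):

```python
# Illustrative usage of the pipeline; label names (e.g. normal/abusive/hate)
# come from the model's config.
result = pipe("هذا مثال على نص عربي")  # placeholder Arabic input
print(result)  # e.g. [{'label': 'normal', 'score': 0.97}]
```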

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("wetey/MARBERT-LHSAB")
model = AutoModelForSequenceClassification.from_pretrained("wetey/MARBERT-LHSAB")

```
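When loading the model directly, you tokenize the input yourself and map the predicted class id back to a label via `model.config.id2label`. A minimal inference sketch (the input sentence is a placeholder):

```python
# Minimal inference sketch for the directly loaded model.
import torch

inputs = tokenizer("هذا مثال على نص عربي", return_tensors="pt")  # placeholder input
with torch.no_grad():
    logits = model(**inputs).logits
probabilities = torch.softmax(logits, dim=-1)
predicted_id = int(probabilities.argmax(dim=-1))
print(model.config.id2label[predicted_id], float(probabilities[0, predicted_id]))
```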

## Fine-tuning Details

### Fine-tuning Data

This model is fine-tuned on the [L-HSAB](https://github.com/Hala-Mulki/L-HSAB-First-Arabic-Levantine-HateSpeech-Dataset) dataset. The exact version we use (after removing duplicates) can be found [](). <!--TODO-->
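As a rough illustration of the duplicate-removal step, assuming the dataset is loaded as a pandas DataFrame (the file name and column name below are assumptions, not from the original card):

```python
# Hypothetical sketch of deduplicating the L-HSAB data before fine-tuning.
import pandas as pd

df = pd.read_csv("L-HSAB.csv")             # placeholder file name
df = df.drop_duplicates(subset=["text"])   # placeholder column name
```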

### Fine-tuning Procedure

The exact fine-tuning procedure we followed can be found [here](https://github.com/wetey/cluster-errors/tree/master/finetuning).

#### Training Hyperparameters

    evaluation_strategy = 'epoch'
    logging_steps = 1
    num_train_epochs = 5
    learning_rate = 1e-5
    eval_accumulation_steps = 2
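For context, these settings map directly onto `transformers.TrainingArguments`. A sketch of the corresponding `Trainer` setup (the output directory and dataset variables are placeholders, not part of the original card):

```python
# Sketch: wiring the listed hyperparameters into a Trainer.
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="marbert-lhsab",    # placeholder output path
    evaluation_strategy="epoch",
    logging_steps=1,
    num_train_epochs=5,
    learning_rate=1e-5,
    eval_accumulation_steps=2,
)

trainer = Trainer(
    model=model,                   # model loaded as shown above
    args=training_args,
    train_dataset=train_dataset,   # placeholder: tokenized L-HSAB train split
    eval_dataset=eval_dataset,     # placeholder: tokenized eval split
)
trainer.train()
```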
    
## Evaluation


### Testing Data 

The test set used can be found [here](https://github.com/wetey/cluster-errors/tree/master/data/datasets).

### Results

`accuracy`: 87.9% <br>
`precision`: 88.1% <br>
`recall`: 87.9% <br>
`f1-score`: 87.9% <br>

#### Results per class
| Label | Precision | Recall | F1-score|
|---------|---------|---------|---------|
| normal | 85% | 82% | 83% |
| abusive | 93% | 92% | 93% |
| hate | 68% | 78% | 72% |
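Per-class numbers of this form can be reproduced with scikit-learn's `classification_report`; a sketch, where `y_true` and `y_pred` are placeholders for the gold labels and model predictions:

```python
# Sketch: computing overall and per-class precision/recall/F1.
from sklearn.metrics import classification_report

print(classification_report(y_true, y_pred,
                            target_names=["normal", "abusive", "hate"]))
```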

## Citation
<!--TODO-->