wetey committed
Commit 8210110 · verified · 1 Parent(s): f5998ac

Update README.md

Files changed (1)
  1. README.md +43 -44
README.md CHANGED
@@ -10,13 +10,14 @@ metrics:
  library_name: transformers
  tags:
  - offensive language detection
  ---

- # Model Card for Model ID

- <!-- Provide a quick summary of what the model is/does. -->

- This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

  ## Model Details

@@ -26,70 +27,68 @@ This modelcard aims to be a base template for new models. It has been generated
  - **Model type:** BERT-based
  - **Language(s) (NLP):** Arabic
- - **License:** [More Information Needed]
  - **Finetuned from model:** UBC-NLP/MARBERT

- ## Training Details

- ### Training Data

- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [More Information Needed]

- ### Training Procedure

- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

- #### Preprocessing [optional]

- [More Information Needed]

- #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

- #### Speeds, Sizes, Times [optional]

- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

- [More Information Needed]

  ## Evaluation

  <!-- This section describes the evaluation protocols and provides the results. -->

- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]

  ### Results

- [More Information Needed]
-
- #### Summary
-
- ## Model Card Authors [optional]
-
- [More Information Needed]

- ## Model Card Contact

- [More Information Needed]
 
 
  library_name: transformers
  tags:
  - offensive language detection
+ base_model:
+ - UBC-NLP/MARBERT
  ---

+ This model is part of the work done in <!-- add paper name -->. <br>
+ The full code can be found at <!-- github repo url -->

  ## Model Details

  - **Model type:** BERT-based
  - **Language(s) (NLP):** Arabic
  - **Finetuned from model:** UBC-NLP/MARBERT

+ ## How to Get Started with the Model

+ Use the code below to get started with the model.

+ ```python
+ # Use a pipeline as a high-level helper
+ from transformers import pipeline

+ pipe = pipeline("text-classification", model="wetey/MARBERT-LHSAB")
+ ```
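For illustration, calling the pipeline might look like the sketch below; the Arabic example sentence is an assumption, and the exact label string returned depends on the model's `id2label` config (the classes reported further down are normal, abusive, and hate).

```python
# Hypothetical usage sketch, not from the card: classify one comment.
# The returned label string depends on the model's id2label mapping.
result = pipe("مثال على تعليق")
print(result)  # e.g. [{'label': 'normal', 'score': 0.97}]
```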

+ ```python
+ # Load model directly
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification

+ tokenizer = AutoTokenizer.from_pretrained("wetey/MARBERT-LHSAB")
+ model = AutoModelForSequenceClassification.from_pretrained("wetey/MARBERT-LHSAB")
+ ```
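When loading the model directly, inference is a standard sequence-classification forward pass. A minimal sketch, assuming PyTorch and an example sentence of our own choosing:

```python
import torch

# Hypothetical example text; any Arabic comment works here.
inputs = tokenizer("مثال على تعليق", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)[0]
pred_id = int(probs.argmax())
# id2label comes from the model config; per the card the classes
# are normal / abusive / hate.
print(model.config.id2label[pred_id], float(probs[pred_id]))
```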

+ ## Fine-tuning Details

+ ### Fine-tuning Data

+ This model is fine-tuned on the [L-HSAB](https://github.com/Hala-Mulki/L-HSAB-First-Arabic-Levantine-HateSpeech-Dataset) dataset. The exact version we use (after removing duplicates) can be found at [](). <!--TODO-->
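As a sketch of the deduplication step mentioned above (the file name and column name are assumptions about the L-HSAB release, not verified details):

```python
import pandas as pd

# Hypothetical file/column names; adjust to the actual L-HSAB files.
df = pd.read_csv("L-HSAB.csv")
df = df.drop_duplicates(subset=["Tweet"]).reset_index(drop=True)
print(f"{len(df)} unique examples")
```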

+ ### Fine-tuning Procedure

+ The exact fine-tuning procedure followed can be found at [](). <!--TODO-->

+ #### Training Hyperparameters

+ - `evaluation_strategy = 'epoch'`
+ - `logging_steps = 1`
+ - `num_train_epochs = 5`
+ - `learning_rate = 1e-5`
+ - `eval_accumulation_steps = 2`
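These names match `transformers.TrainingArguments`, so a plausible reading of the setup is the sketch below; `output_dir` and anything not listed above are placeholders, not values from this card.

```python
from transformers import TrainingArguments

# Hyperparameters as listed on the card; output_dir is hypothetical.
args = TrainingArguments(
    output_dir="marbert-lhsab-finetune",
    evaluation_strategy="epoch",
    logging_steps=1,
    num_train_epochs=5,
    learning_rate=1e-5,
    eval_accumulation_steps=2,
)
```

These arguments would then be passed to a `transformers.Trainer` along with the tokenized train and evaluation splits.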
  ## Evaluation

  <!-- This section describes the evaluation protocols and provides the results. -->

+ ### Testing Data

+ The test set used can be found at [](). <!--TODO-->

  ### Results

+ `accuracy`: 87.9% <br>
+ `precision`: 88.1% <br>
+ `recall`: 87.9% <br>
+ `f1-score`: 87.9% <br>

+ #### Results per class

+ | Label   | Precision | Recall | F1-score |
+ |---------|-----------|--------|----------|
+ | normal  | 85%       | 82%    | 83%      |
+ | abusive | 93%       | 92%    | 93%      |
+ | hate    | 68%       | 78%    | 72%      |
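Per-class tables like this are commonly produced with scikit-learn's `classification_report`; a runnable sketch with placeholder labels (the real inputs would be the gold test labels and the model's predictions):

```python
from sklearn.metrics import classification_report

# Placeholder data only; substitute the test-set gold labels and
# the model's predicted labels over the test split.
y_true = ["normal", "abusive", "hate", "normal"]
y_pred = ["normal", "abusive", "normal", "normal"]
print(classification_report(y_true, y_pred,
                            labels=["normal", "abusive", "hate"]))
```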

+ ## Citation

+ <!--TODO-->