wetey/MARBERT-LHSAB

	---
	license: mit
	language:
	- ar
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	library_name: transformers
	tags:
	- offensive language detection
	base_model:
	- UBC-NLP/MARBERT
	---


	This model is part of the work done in <!-- add paper name -->. <br>
	The full code can be found at https://github.com/wetey/cluster-errors


	## Model Details

	### Model Description

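	MARBERT fine-tuned on the L-HSAB dataset for offensive language detection in Arabic. The results below are reported over three classes: `normal`, `abusive`, and `hate`.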

	- Model type: BERT-based
	- Language(s) (NLP): Arabic
	- Finetuned from model: UBC-NLP/MARBERT

	## How to Get Started with the Model

	Use the code below to get started with the model.

	```python
	# Use a pipeline as a high-level helper
	from transformers import pipeline

	pipe = pipeline("text-classification", model="wetey/MARBERT-LHSAB")

	```
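As a minimal sketch of how the pipeline might be used once loaded, the hypothetical helper `classify` below (not part of the original repository) runs a batch of strings through it and pairs each input with its predicted label and score. The label strings returned depend on the model's configuration and are not confirmed here.

```python
def classify(texts, pipe=None):
    """Return (text, label, score) triples for a list of strings."""
    if pipe is None:
        # Imported lazily so the helper can also be exercised with a stand-in pipeline.
        from transformers import pipeline

        pipe = pipeline("text-classification", model="wetey/MARBERT-LHSAB")
    # For a list input, a text-classification pipeline returns one dict per text
    # with "label" and "score" keys.
    results = pipe(texts)
    return [(text, r["label"], r["score"]) for text, r in zip(texts, results)]
```

Calling `classify(["مرحبا"])` downloads the model on first use and returns a single triple for that input.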

	```python
	# Load model directly
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	tokenizer = AutoTokenizer.from_pretrained("wetey/MARBERT-LHSAB")
	model = AutoModelForSequenceClassification.from_pretrained("wetey/MARBERT-LHSAB")

	```
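When the tokenizer and model are loaded directly, the logits have to be converted into a label by hand. The sketch below shows one way to do that, assuming PyTorch is installed and that the model's `config.id2label` mapping is populated; `softmax` and `predict` are illustrative names, not functions from the original code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(text, tokenizer, model):
    """Return (label, probability) for a single input string."""
    import torch  # assumption: the PyTorch backend is used

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # id2label comes from the model config; its exact strings are not confirmed here.
    return model.config.id2label[best], probs[best]
```

With the objects loaded above, `predict("مرحبا", tokenizer, model)` returns the top label and its probability.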

	## Fine-tuning Details

	### Fine-tuning Data

	This model is fine-tuned on the [L-HSAB](https://github.com/Hala-Mulki/L-HSAB-First-Arabic-Levantine-HateSpeech-Dataset) dataset. The exact version we use (after removing duplicates) can be found [](). <!--TODO-->

	### Fine-tuning Procedure

	The exact fine-tuning procedure can be found [here](https://github.com/wetey/cluster-errors/tree/master/finetuning).

	#### Training Hyperparameters

	- `evaluation_strategy`: `'epoch'`
	- `logging_steps`: 1
	- `num_train_epochs`: 5
	- `learning_rate`: 1e-5
	- `eval_accumulation_steps`: 2
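The hyperparameters above are argument names from the `transformers` `TrainingArguments` class. The following is a rough sketch of how they might be passed in, not the authors' actual script (which lives in the linked repository). The `output_dir` value is a placeholder, and the fallback to `eval_strategy` is an assumption to cover newer `transformers` releases in which `evaluation_strategy` was renamed.

```python
import inspect

# Values copied from the list above.
HYPERPARAMETERS = {
    "evaluation_strategy": "epoch",
    "logging_steps": 1,
    "num_train_epochs": 5,
    "learning_rate": 1e-5,
    "eval_accumulation_steps": 2,
}


def build_training_arguments(output_dir="marbert-lhsab-finetuned"):
    """Build TrainingArguments; output_dir is a hypothetical placeholder."""
    from transformers import TrainingArguments

    kwargs = dict(HYPERPARAMETERS)
    accepted = inspect.signature(TrainingArguments.__init__).parameters
    if "evaluation_strategy" not in accepted:
        # Newer releases expose this setting as `eval_strategy` instead.
        kwargs["eval_strategy"] = kwargs.pop("evaluation_strategy")
    return TrainingArguments(output_dir=output_dir, **kwargs)
```

All other settings are left at the library defaults in this sketch, which may differ from what was used for the reported results.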

	## Evaluation

	<!-- This section describes the evaluation protocols and provides the results. -->

	### Testing Data

	The test set can be found [here](https://github.com/wetey/cluster-errors/tree/master/data/datasets).

	### Results

	`accuracy`: 87.9% <br>
	`precision`: 88.1% <br>
	`recall`: 87.9% <br>
	`f1-score`: 87.9% <br>

	#### Results per class

	| Label   | Precision | Recall | F1-score |
	|---------|-----------|--------|----------|
	| normal  | 85%       | 82%    | 83%      |
	| abusive | 93%       | 92%    | 93%      |
	| hate    | 68%       | 78%    | 72%      |
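As a quick sanity check on the table, each F1-score should be close to the harmonic mean of its precision and recall. The sketch below recomputes it from the rounded figures. It is illustrative arithmetic, not the evaluation code used for the model, so small differences from the reported values are expected because the inputs are already rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the per-class table.
PER_CLASS = {
    "normal": (0.85, 0.82, 0.83),
    "abusive": (0.93, 0.92, 0.93),
    "hate": (0.68, 0.78, 0.72),
}

for label, (p, r, reported) in PER_CLASS.items():
    recomputed = f1_score(p, r)
    print(f"{label}: recomputed F1 = {recomputed:.3f}, reported = {reported:.2f}")
```

Each recomputed value lands within one percentage point of the reported F1-score.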

	## Citation
	<!--TODO-->