|
--- |
|
license: apache-2.0 |
|
tags: |
|
- Irony Detection |
|
- Text Classification |
|
- tweet_eval |
|
|
|
|
|
model-index: |
|
- name: roberta-base-finetuned-irony |
|
results: [] |
|
--- |
|
|
|
# roberta-base-finetuned-irony |
|
|
|
This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the Irony Dataset from [Tweet_Eval](https://huggingface.co/datasets/tweet_eval). |
|
|
|
This is the classification report on the held-out test set after training for 10 full epochs (a sketch of how such a report can be generated follows the table):
|
|
|
|               | Precision | Recall | F1-Score | Support |
|:-------------:|:---------:|:------:|:--------:|:-------:|
| Not Irony (0) | 0.73      | 0.78   | 0.75     | 473     |
| Irony (1)     | 0.62      | 0.56   | 0.59     | 311     |
| accuracy      |           |        | 0.69     | 784     |
| macro avg     | 0.68      | 0.67   | 0.67     | 784     |
| weighted avg  | 0.69      | 0.69   | 0.69     | 784     |
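
A report in this format can be produced with scikit-learn's `classification_report`. The snippet below is only a minimal sketch with placeholder labels and predictions, not the actual evaluation script that generated the numbers above:

```python
from sklearn.metrics import classification_report

# Placeholder arrays standing in for the real test split:
# y_true are the gold labels, y_pred the model's argmax predictions (0 = not irony, 1 = irony).
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1]

print(classification_report(y_true, y_pred, target_names=["Not Irony (0)", "Irony (1)"]))
```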
|
|
|
## Training and evaluation data |
|
|
|
The full training pipeline is available in [this](https://github.com/vikram71198/Transformers/tree/main/Irony%20Detection) repository. The dataset is split into 2,862 examples for training, 955 for validation and 784 for testing.
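
These splits match the `irony` configuration of `tweet_eval` on the Hub. A minimal sketch of loading them with the `datasets` library:

```python
from datasets import load_dataset

# The "irony" configuration of tweet_eval ships with the three splits used here:
# train (2,862), validation (955) and test (784) examples.
dataset = load_dataset("tweet_eval", "irony")

print(dataset)              # DatasetDict with train/validation/test splits
print(dataset["train"][0])  # {'text': '...', 'label': 0 or 1}
```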
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a rough sketch of the corresponding `TrainingArguments` follows below):
|
- learning_rate: 5e-05 |
|
- train_batch_size: 32 |
|
- eval_batch_size: 32 |
|
- optimizer: AdamW (the default optimizer)
|
- num_epochs: 10 |
|
- warmup_steps: 500 |
|
- weight_decay: 0.01 |
|
- random seed: 42 |
|
|
|
Training ran for 10 full epochs on Colab's Tesla P100-PCIE-16GB GPU.
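
As a rough illustration, these settings map onto the Hugging Face `Trainer` API approximately as follows. The exact script lives in the repository linked above, so treat this as a sketch; the `output_dir` is just a placeholder.

```python
from transformers import TrainingArguments

# Approximate mapping of the hyperparameters listed above onto TrainingArguments.
training_args = TrainingArguments(
    output_dir="roberta-base-finetuned-irony",  # placeholder output directory
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=10,
    warmup_steps=500,
    weight_decay=0.01,
    seed=42,
    evaluation_strategy="epoch",  # report validation loss once per epoch, as in the table below
)
```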
|
|
|
### Training results |
|
|
|
| Epoch | Training Loss | Validation Loss |
|:-----:|:-------------:|:---------------:|
| 1     | 0.691600      | 0.6738196       |
| 2     | 0.621800      | 0.611911        |
| 3     | 0.510800      | 0.516174        |
| 4     | 0.384700      | 0.574607        |
| 5     | 0.273900      | 0.644613        |
| 6     | 0.162300      | 0.846262        |
| 7     | 0.119000      | 0.869178        |
| 8     | 0.079700      | 1.131574        |
| 9     | 0.035800      | 1.5123457       |
| 10    | 0.013600      | 1.5706617       |
|
|
|
## Model in Action
|
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch.nn as nn

tokenizer = AutoTokenizer.from_pretrained("vikram71198/roberta-base-finetuned-irony")
model = AutoModelForSequenceClassification.from_pretrained("vikram71198/roberta-base-finetuned-irony")

# Following the same truncation & padding strategy used while training.
# A single string or a list of tweets can be passed in.
encoded_input = tokenizer(
    "Enter any text/tweet to be classified. Can input a list of tweets too.",
    padding=True,
    truncation=True,
    return_tensors="pt",
)

output = model(**encoded_input)["logits"]

# Detach the logits from the computation graph and apply softmax
# (single-label classification) to obtain class probabilities.
softmax = nn.Softmax(dim=1)
prediction_probabilities = softmax(output.detach()).numpy()

# Column 0 is the "not_irony" probability, column 1 is the "irony" probability.
predictions = [
    "not_irony" if not_irony_prob > irony_prob else "irony"
    for not_irony_prob, irony_prob in prediction_probabilities
]
print(predictions)
```
|
Please note that if you're running inference on a large dataset, split it into batches; otherwise you'll run out of RAM unless you're on a very high-end GPU/TPU setup. A batch size of around 50 works well on a typical single-GPU setup.
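
A minimal sketch of that batched approach (the `tweets` list is a placeholder, and the batch size of 50 follows the recommendation above):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("vikram71198/roberta-base-finetuned-irony")
model = AutoModelForSequenceClassification.from_pretrained("vikram71198/roberta-base-finetuned-irony")
model.eval()

tweets = ["tweet 1", "tweet 2", "..."]  # placeholder: your full list of texts
batch_size = 50
predictions = []

for i in range(0, len(tweets), batch_size):
    batch = tweets[i:i + batch_size]
    encoded = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**encoded).logits
    # class 0 = not_irony, class 1 = irony
    predictions.extend(
        "irony" if label == 1 else "not_irony"
        for label in logits.argmax(dim=1).tolist()
    )

print(predictions)
```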
|
### Framework versions |
|
|
|
- Transformers 4.12.5 |
|
- PyTorch 1.11.0
|
- Datasets 1.17.0 |
|
- Tokenizers 0.10.3 |