---
license: mit
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - f1
  - precision
  - recall
model-index:
  - name: roberta-finetuned-WebClassification
    results: []
pipeline_tag: text-classification
---

roberta-finetuned-WebClassification

This model is a fine-tuned version of xlm-roberta-base on the Web Classification Dataset. It achieves the following results on the evaluation set (see the sketch after the list for how these scores may have been computed):

  • Loss: 0.3473
  • Accuracy: 0.9504
  • F1: 0.9504
  • Precision: 0.9504
  • Recall: 0.9504
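All four scores coincide, which is consistent with micro averaging, where precision, recall and F1 all reduce to accuracy in a single-label multiclass setup. The card does not state the averaging mode, so the following compute_metrics helper is only a plausible reconstruction:

```python
# Hypothetical compute_metrics helper for the Hugging Face Trainer.
# "micro" averaging is an assumption; with micro averaging, precision,
# recall and F1 all equal accuracy for single-label multiclass tasks,
# which would explain the identical scores reported above.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="micro"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```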

Model description

The model classifies websites into the following categories (see the usage sketch after the list):

  • "0": "Adult",
  • "1": "Business/Corporate",
  • "2": "Computers and Technology",
  • "3": "E-Commerce",
  • "4": "Education",
  • "5": "Food",
  • "6": "Forums",
  • "7": "Games",
  • "8": "Health and Fitness",
  • "9": "Law and Government",
  • "10": "News",
  • "11": "Photography",
  • "12": "Social Networking and Messaging",
  • "13": "Sports",
  • "14": "Streaming Services",
  • "15": "Travel"

Intended uses & limitations

Intended for classifying English-language web content; other languages are not yet supported.

Training and evaluation data

Trained and evaluated on an 80/20 train/test split of the Web Classification Dataset.
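A minimal sketch of such a split with the datasets library (the CSV file name and column layout are assumptions; only the 80/20 ratio comes from this card, and seed 42 mirrors the training seed listed below):

```python
# Hypothetical 80/20 split with the `datasets` library.
# "website_classification.csv" is an assumed file name; only the split
# ratio and the seed are taken from this card.
from datasets import load_dataset

dataset = load_dataset("csv", data_files="website_classification.csv")["train"]
split = dataset.train_test_split(test_size=0.2, seed=42)
train_ds, eval_ds = split["train"], split["test"]
```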

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the training sketch after the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
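A condensed fine-tuning sketch with the Trainer API, reusing train_ds, eval_ds and compute_metrics from the sketches above; only the hyperparameter values are taken from this card, while the tokenization step and column names are assumptions:

```python
# Condensed fine-tuning sketch. Dataset handling details are assumptions;
# the hyperparameter values mirror the list above (Adam betas/epsilon are
# the TrainingArguments defaults, so they are not set explicitly).
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=16
)

def tokenize(batch):
    # "text" as the input column is an assumption; the label column is
    # assumed to be named "label", as expected by the Trainer.
    return tokenizer(batch["text"], truncation=True)

train_ds = train_ds.map(tokenize, batched=True)
eval_ds = eval_ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="roberta-finetuned-WebClassification",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,  # defined in the sketch above
)
trainer.train()
```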

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| No log        | 1.0   | 141  | 0.9315          | 0.8617   | 0.8617 | 0.8617    | 0.8617 |
| No log        | 2.0   | 282  | 0.4956          | 0.9007   | 0.9007 | 0.9007    | 0.9007 |
| No log        | 3.0   | 423  | 0.4142          | 0.9184   | 0.9184 | 0.9184    | 0.9184 |
| 0.9036        | 4.0   | 564  | 0.3998          | 0.9255   | 0.9255 | 0.9255    | 0.9255 |
| 0.9036        | 5.0   | 705  | 0.3235          | 0.9397   | 0.9397 | 0.9397    | 0.9397 |
| 0.9036        | 6.0   | 846  | 0.3631          | 0.9397   | 0.9397 | 0.9397    | 0.9397 |
| 0.9036        | 7.0   | 987  | 0.3705          | 0.9362   | 0.9362 | 0.9362    | 0.9362 |
| 0.0898        | 8.0   | 1128 | 0.3469          | 0.9468   | 0.9468 | 0.9468    | 0.9468 |
| 0.0898        | 9.0   | 1269 | 0.3657          | 0.9326   | 0.9326 | 0.9326    | 0.9326 |
| 0.0898        | 10.0  | 1410 | 0.3473          | 0.9504   | 0.9504 | 0.9504    | 0.9504 |

Framework versions

  • Transformers 4.16.2
  • Pytorch 1.9.1
  • Datasets 1.18.4
  • Tokenizers 0.11.6