---
license: mit
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: roberta-finetuned-WebClassification
  results: []
pipeline_tag: text-classification
---

# roberta-finetuned-WebClassification

This model is a fine-tuned version of [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) on the [Web Classification Dataset](https://www.kaggle.com/datasets/hetulmehta/website-classification).
It achieves the following results on the evaluation set:
- Loss: 0.3473
- Accuracy: 0.9504
- F1: 0.9504
- Precision: 0.9504
- Recall: 0.9504

## Model description

The model classifies websites into the following categories:

- "0": "Adult"
- "1": "Business/Corporate"
- "2": "Computers and Technology"
- "3": "E-Commerce"
- "4": "Education"
- "5": "Food"
- "6": "Forums"
- "7": "Games"
- "8": "Health and Fitness"
- "9": "Law and Government"
- "10": "News"
- "11": "Photography"
- "12": "Social Networking and Messaging"
- "13": "Sports"
- "14": "Streaming Services"
- "15": "Travel"

## Intended uses & limitations

Classification of English-language websites; other languages may be supported in the future. See the usage sketch at the end of this card.

## Training and evaluation data

Trained and tested on an 80/20 split of the [Web Classification Dataset](https://www.kaggle.com/datasets/hetulmehta/website-classification).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| No log        | 1.0   | 141  | 0.9315          | 0.8617   | 0.8617 | 0.8617    | 0.8617 |
| No log        | 2.0   | 282  | 0.4956          | 0.9007   | 0.9007 | 0.9007    | 0.9007 |
| No log        | 3.0   | 423  | 0.4142          | 0.9184   | 0.9184 | 0.9184    | 0.9184 |
| 0.9036        | 4.0   | 564  | 0.3998          | 0.9255   | 0.9255 | 0.9255    | 0.9255 |
| 0.9036        | 5.0   | 705  | 0.3235          | 0.9397   | 0.9397 | 0.9397    | 0.9397 |
| 0.9036        | 6.0   | 846  | 0.3631          | 0.9397   | 0.9397 | 0.9397    | 0.9397 |
| 0.9036        | 7.0   | 987  | 0.3705          | 0.9362   | 0.9362 | 0.9362    | 0.9362 |
| 0.0898        | 8.0   | 1128 | 0.3469          | 0.9468   | 0.9468 | 0.9468    | 0.9468 |
| 0.0898        | 9.0   | 1269 | 0.3657          | 0.9326   | 0.9326 | 0.9326    | 0.9326 |
| 0.0898        | 10.0  | 1410 | 0.3473          | 0.9504   | 0.9504 | 0.9504    | 0.9504 |

### Framework versions

- Transformers 4.16.2
- Pytorch 1.9.1
- Datasets 1.18.4
- Tokenizers 0.11.6
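
## How to use

A minimal inference sketch with the Transformers `pipeline` API. The model path below is a placeholder; point it at the Hub repo or local directory where this checkpoint actually lives.

```python
from transformers import pipeline

# Placeholder path: replace with the Hub repo id or local directory
# that holds this fine-tuned checkpoint.
classifier = pipeline(
    "text-classification",
    model="roberta-finetuned-WebClassification",
)

text = "Book cheap flights, hotels and holiday packages for your next trip."
print(classifier(text))
# e.g. [{'label': 'Travel', 'score': 0.99}] -- assuming the checkpoint's
# config carries the id2label mapping listed above; otherwise the pipeline
# falls back to generic names such as 'LABEL_15'.
```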
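
## Training script sketch

For reference, the hyperparameters above map onto a `Trainer` setup roughly as follows. This is a hedged reconstruction, not the original training script: the CSV filename and column names (`cleaned_website_text`, `Category`) are assumptions about the Kaggle download, and the accuracy/F1/precision/recall computation is omitted for brevity.

```python
import pandas as pd
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumed Kaggle CSV layout: one text column, one category column.
# Sorting the category names alphabetically lines the integer ids up
# with the id-to-label mapping listed above.
df = pd.read_csv("website_classification.csv")
labels = sorted(df["Category"].unique())
label2id = {name: i for i, name in enumerate(labels)}
df["label"] = df["Category"].map(label2id)

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

def tokenize(batch):
    return tokenizer(batch["cleaned_website_text"], truncation=True)

# 80/20 train/test split, tokenized for the model.
dataset = Dataset.from_pandas(
    df[["cleaned_website_text", "label"]], preserve_index=False
)
splits = dataset.train_test_split(test_size=0.2, seed=42).map(
    tokenize, batched=True
)

model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base",
    num_labels=len(labels),
    id2label={i: name for name, i in label2id.items()},
    label2id=label2id,
)

# Hyperparameters from the card; the Adam betas and epsilon listed above
# are already the Trainer defaults, so they are not set explicitly.
args = TrainingArguments(
    output_dir="roberta-finetuned-WebClassification",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=10,
    seed=42,
    lr_scheduler_type="linear",
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```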