AG-News BERT Classification

Model Details

Model Name: AG-News BERT Classification
Model Type: Text Classification
Developer: Mansoor Hamidzadeh
Repository: mansoorhamidzadeh/ag-news-bert-classification
Language(s): English
License: MIT

Model Description

Overview

The AG-News BERT Classification model is a BERT (Bidirectional Encoder Representations from Transformers) model fine-tuned on the AG-News dataset to classify news articles into four categories: World, Sports, Business, and Sci/Tech.

Intended Use

Primary Use Case

The primary use case for this model is to automatically classify news articles into one of the four predefined categories:

  • World
  • Sports
  • Business
  • Sci/Tech

This can be useful for news aggregation services, content recommendation systems, and any application that requires automated content categorization.

Applications

  • News aggregators and curators
  • Content recommendation engines
  • Media monitoring tools
  • Topic trend detection in news (the model predicts topic categories, not sentiment)

Training Data

Dataset

  • Name: AG-News Dataset
  • Source: AG News Corpus
  • Description: The AG-News dataset is a widely used benchmark dataset for text classification. It contains 120,000 training samples and 7,600 test samples of news articles categorized into four classes: World, Sports, Business, and Sci/Tech.

Data Preprocessing

The text was tokenized with the BERT tokenizer, the tokens were converted to their corresponding vocabulary IDs, and attention masks were created to distinguish real tokens from padding.
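The preprocessing steps above can be illustrated with a small, self-contained sketch. It is not the real BERT tokenizer: the whitespace splitting, the tiny `TOY_VOCAB`, its ID values, and the `encode` helper are all hypothetical stand-ins chosen so the example runs without downloading anything. The card also does not state the maximum sequence length used, so `max_length=8` is only an illustrative value.

```python
# Toy illustration only: a hypothetical whitespace tokenizer and tiny vocabulary
# stand in for the real BERT WordPiece tokenizer and its vocabulary.
TOY_VOCAB = {
    "[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
    "stocks": 4, "rally": 5, "on": 6, "earnings": 7, "team": 8, "wins": 9,
}


def encode(text, max_length):
    """Return (input_ids, attention_mask), each exactly max_length long."""
    words = text.lower().split()
    # Reserve two positions for the [CLS] and [SEP] special tokens, truncating if needed.
    tokens = ["[CLS]"] + words[: max_length - 2] + ["[SEP]"]
    # Map each token to its ID; unknown tokens fall back to [UNK].
    input_ids = [TOY_VOCAB.get(token, TOY_VOCAB["[UNK]"]) for token in tokens]
    # Real tokens get mask 1.
    attention_mask = [1] * len(input_ids)
    # Pad to a fixed length; padded positions get mask 0 so the model ignores them.
    padding = max_length - len(input_ids)
    input_ids += [TOY_VOCAB["[PAD]"]] * padding
    attention_mask += [0] * padding
    return input_ids, attention_mask


ids, mask = encode("Stocks rally on earnings", max_length=8)
print(ids)   # [2, 4, 5, 6, 7, 3, 0, 0]
print(mask)  # [1, 1, 1, 1, 1, 1, 0, 0]
```

With the actual tokenizer, the same two outputs appear as the `input_ids` and `attention_mask` fields of the encoded batch.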

Training Procedure

Training Configuration:

  • Number of Epochs: 4
  • Batch Size: 8
  • Learning Rate: 1e-5
  • Optimizer: AdamW
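To put the configuration above in perspective, the following sketch estimates how many optimizer steps it implies. It assumes the full 120,000-sample training split was used, with no gradient accumulation and a single device; the card does not confirm any of these. The `TrainingConfig` class and `optimizer_steps` helper are hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Values taken from the training configuration listed in this card.
    epochs: int = 4
    batch_size: int = 8
    learning_rate: float = 1e-5
    optimizer: str = "AdamW"


def optimizer_steps(num_samples, config):
    """Return (steps per epoch, total steps), assuming one optimizer step per batch."""
    steps_per_epoch = math.ceil(num_samples / config.batch_size)
    return steps_per_epoch, steps_per_epoch * config.epochs


config = TrainingConfig()
# 120,000 is the AG-News training set size; using all of it is an assumption.
steps_per_epoch, total_steps = optimizer_steps(120_000, config)
print(steps_per_epoch, total_steps)  # 15000 60000
```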

Training and Test Losses:

  • Epoch 1:
    • Average training loss: 0.1330
    • Average test loss: 0.1762
  • Epoch 2:
    • Average training loss: 0.0918
    • Average test loss: 0.1733
  • Epoch 3:
    • Average training loss: 0.0622
    • Average test loss: 0.1922
  • Epoch 4:
    • Average training loss: 0.0416
    • Average test loss: 0.2305
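Read together, the figures above show the test loss reaching its minimum at epoch 2 and rising in epochs 3 and 4 while the training loss keeps falling, a pattern that usually indicates overfitting. The sketch below picks the epoch with the lowest test loss and prints the gap between the two losses. The card does not say which epoch's checkpoint was published, so this selection rule is an illustration and not a description of the author's procedure.

```python
# Per-epoch (average training loss, average test loss), copied from this card.
losses = {
    1: (0.1330, 0.1762),
    2: (0.0918, 0.1733),
    3: (0.0622, 0.1922),
    4: (0.0416, 0.2305),
}

# Epoch whose test loss is lowest.
best_epoch = min(losses, key=lambda epoch: losses[epoch][1])

# Gap between test and training loss; a widening gap suggests overfitting.
gaps = {epoch: test - train for epoch, (train, test) in losses.items()}

for epoch, (train_loss, test_loss) in losses.items():
    marker = "  <- lowest test loss" if epoch == best_epoch else ""
    print(f"epoch {epoch}: train={train_loss:.4f} test={test_loss:.4f} "
          f"gap={gaps[epoch]:.4f}{marker}")
```

On these numbers the selected epoch is 2, and the gap grows at every epoch.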

Hardware:

  • Training Environment: NVIDIA P100 GPU
  • Training Time: Approximately 3 hours

Performance

Evaluation Metrics

The model was evaluated using standard text classification metrics:

  • Accuracy
  • Precision
  • Recall
  • F1 Score
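For reference, the four metrics above can be computed from true and predicted labels as in the following sketch. The card does not state how precision, recall, and F1 were averaged across the four classes; macro averaging (an unweighted mean over classes) is an assumption here. The `classification_metrics` helper and the toy labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, num_classes):
    """Accuracy plus macro-averaged precision, recall, and F1 (averaging is an assumption)."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1_scores = [], [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)  # true positives for class c
        fp = sum(t != c and p == c for t, p in pairs)  # false positives for class c
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives for class c
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / num_classes,
        "recall": sum(recalls) / num_classes,
        "f1": sum(f1_scores) / num_classes,
    }


# Toy labels for illustration only (0=World, 1=Sports, 2=Business, 3=Sci/Tech).
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 2, 3, 0]
metrics = classification_metrics(y_true, y_pred, num_classes=4)
print(metrics)
```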

Results

On the AG-News test set, the model achieved the following performance:

  • Accuracy: 93.8%
  • Precision: 93.8%
  • Recall: 93.8%
  • F1 Score: 93.8%

Limitations and Biases

Limitations

  • The model may not generalize well to text types or news sources that differ from the AG-News dataset.
  • The model is designed primarily for English text and may not perform well on text in other languages.

Biases

  • The training data may contain biases that reflect biases in news reporting.
  • Category-specific biases may arise from the distribution of articles in the dataset.

Ethical Considerations

  • Ensure the model is used in compliance with user privacy and data security standards.
  • Be aware of potential biases and take steps to mitigate negative impacts, especially in sensitive applications.

How to Use

Inference

To use the model for inference, load it using the Hugging Face Transformers library:

from transformers import BertForSequenceClassification, BertTokenizer, TextClassificationPipeline

# Hugging Face Hub repository that hosts the fine-tuned weights and tokenizer files
model_id = "mansoorhamidzadeh/ag-news-bert-classification"

tokenizer = BertTokenizer.from_pretrained(model_id)
model = BertForSequenceClassification.from_pretrained(model_id)

# Named `classifier` so it does not shadow the `pipeline` helper exported by transformers
classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer)

text = "Sample news article text here."
prediction = classifier(text)  # list with one dict holding the predicted label and its score
print(prediction)
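The card does not document which label strings the pipeline returns. When a model configuration defines no label names, Transformers falls back to generic names such as `LABEL_0`. The sketch below maps such names to the four categories. It assumes the class order follows the Hugging Face `ag_news` dataset (0 World, 1 Sports, 2 Business, 3 Sci/Tech), which has not been verified for this model; `readable_label` and the example prediction are hypothetical.

```python
# Assumed class order (that of the Hugging Face `ag_news` dataset); not confirmed for this model.
AG_NEWS_LABELS = ["World", "Sports", "Business", "Sci/Tech"]


def readable_label(raw_label):
    """Map a pipeline label such as 'LABEL_2' to a category name."""
    # If the model already returns category names, pass them through unchanged.
    if raw_label in AG_NEWS_LABELS:
        return raw_label
    prefix = "LABEL_"
    suffix = raw_label[len(prefix):]
    if raw_label.startswith(prefix) and suffix.isdigit() and int(suffix) < len(AG_NEWS_LABELS):
        return AG_NEWS_LABELS[int(suffix)]
    raise ValueError(f"Unrecognized label: {raw_label!r}")


# Hypothetical pipeline output, shaped like the list of dicts the pipeline returns.
example_prediction = [{"label": "LABEL_2", "score": 0.97}]
for item in example_prediction:
    print(readable_label(item["label"]), item["score"])  # Business 0.97
```

Before relying on this mapping, compare it against the `id2label` field of the loaded model's configuration.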


Citation

@misc{mansoorhamidzadeh,
  author = {Mansoor Hamidzadeh},
  title = {AG-News BERT Classification},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/mansoorhamidzadeh/ag-news-bert-classification}},
}