1. Model Details

Attribute	Value
Developed by	Petercusin (Guisheng Pan)
Model Architecture	fastText
Number of words	267,799
Number of labels	42
Labels	['Arts', 'Arts & Culture', 'Black voices', 'Business', 'College', 'Comedy', 'Crime', 'Culture & Arts', 'Divorce', 'Education', 'Entertainment', 'Environment', 'Fifty', 'Food & Drink', 'Good news', 'Green', 'Healthy living', 'Home & Living', 'Impact', 'Latino voices', 'Media', 'Money', 'Parenting', 'Parents', 'Politics', 'Queer voices', 'Religion', 'Science', 'Sports', 'Style', 'Style & Beauty', 'Taste', 'Tech', 'The worldpost', 'Travel', 'U.S. news', 'Weddings', 'Weird news', 'Wellness', 'Women', 'World news', 'Worldpost']
Words/sec/thread	497,437
Learning rate (at end of training)	0.000000
Average loss	0.550502
Model saved to	PGS_news_classifier.bin (890 MB)
Precision@1	0.7414219768714605
Recall@1	0.7414219768714605
Number of test examples	42,026
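
The identical Precision@1 and Recall@1 values are what one would expect if every test example carries exactly one gold label: with k = 1, both metrics then reduce to the share of examples whose top prediction is correct. The card does not state that the test data is single-label, so that is an assumption here. The sketch below pools counts over all examples and uses made-up toy labels, not the real test set.

```python
def precision_recall_at_k(gold, predicted, k=1):
    """Precision@k and recall@k with counts pooled over all examples.

    gold:      list of sets of true labels, one set per example
    predicted: list of ranked label lists, one list per example
    """
    hits = 0         # predictions that match a gold label
    n_predicted = 0  # total predictions made (denominator of precision)
    n_gold = 0       # total gold labels (denominator of recall)
    for truth, ranked in zip(gold, predicted):
        top_k = ranked[:k]
        hits += sum(1 for label in top_k if label in truth)
        n_predicted += len(top_k)
        n_gold += len(truth)
    return hits / n_predicted, hits / n_gold


# Toy data (hypothetical): one gold label and one prediction per example.
gold = [{"SPORTS"}, {"TECH"}, {"POLITICS"}, {"TRAVEL"}]
predicted = [["SPORTS"], ["TECH"], ["BUSINESS"], ["TRAVEL"]]

p_at_1, r_at_1 = precision_recall_at_k(gold, predicted, k=1)
print(p_at_1, r_at_1)  # 0.75 0.75
```

With one gold label and one prediction per example, both denominators equal the number of examples, which is why the two figures coincide.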

2. Model Description

This model classifies English news text (headlines or longer passages) into one of the 42 categories listed above. It can be used for news categorization, content organization, and topic-based filtering.
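
As a rough illustration of the content-organization and topic-filtering use cases, the sketch below buckets texts by predicted label. The names `group_by_category` and `fake_classify` are hypothetical and not part of any library. The stand-in classifier exists only so the example runs without the model file; in practice the callable would wrap the loaded model.

```python
from collections import defaultdict


def group_by_category(texts, classify):
    """Bucket texts by predicted label.

    `classify` is any callable that returns (label, score) for one text.
    """
    buckets = defaultdict(list)
    for text in texts:
        label, score = classify(text)
        buckets[label].append((text, score))
    return dict(buckets)


def fake_classify(text):
    """Stand-in classifier (hypothetical) with invented scores."""
    return ("SPORTS", 0.90) if "medal" in text else ("BUSINESS", 0.80)


groups = group_by_category(
    ["Team wins gold medal", "Markets rally", "Swimmer takes a medal"],
    fake_classify,
)

# Topic-based filtering: keep only the texts predicted as SPORTS.
sports_only = [text for text, _ in groups.get("SPORTS", [])]
print(sports_only)
```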

⚙️ 3. How to Get Started with the Model

# -*- coding: utf-8 -*-
"""
Created on Sun Apr 20 20:24:58 2025

@author: Petercusin
"""

import fasttext
# predict_category comes from the PgsFile package (pip install pgsfile)
# https://pypi.org/project/PgsFile/
from PgsFile import predict_category

# Load the trained model
model_path = "PGS_news_classifier.bin"
model = fasttext.load_model(model_path)

# New titles or texts to classify
new_titles = [
    "Stock markets reach all-time high amid economic recovery",
    "Scientists discover new species in Amazon rainforest",
    "Congress passes new bill on healthcare reforms",
    "The stairway to love: Chongqing's real-life fairy tale",
    "African delegation take in Shanghai sights on Huangpu cruise",
    "China expected to achieve higher grain output in 2025: report",
    "China continued its dominance at the 2025 World Aquatics Diving World Cup in Guadalajara, sweeping all four gold medals on the third day of competitions on Saturday, along with one silver.",
    "A 'DeepSeek moment for AI agents' as China launches Manus",
    "Developed by Monica.im, Manus achieved top scores on the GAIA (General AI Assistant) benchmark, exceeding those of OpenAI's GPT (generative pre-trained transformer) tools. GAIA is a real-world benchmark for general AI assistants.",
    "This week and without warning, a horrid video popped up on my phone. A puppy had its mouth and paws bound with tape, and was hanging in a plastic bag by the motorway. I immediately flicked past, but the image stayed with me. This was something I didn’t want to see, yet there it was at 11am on a Tuesday."
]

try:
    # Process new titles
    for title in new_titles:
        label, score = predict_category(model, title)
        print(f"Title/Text: {title}")
        print(f"Predicted category: <{label}> (confidence: {score:.2f})")
        print("*" * 50)  # Add a separator line for better readability
finally:
    # Explicitly delete the model to free up memory
    del model

Result

Title/Text: Stock markets reach all-time high amid economic recovery
Predicted category: <BUSINESS> (confidence: 1.00)
**************************************************
Title/Text: Scientists discover new species in Amazon rainforest
Predicted category: <SCIENCE> (confidence: 0.99)
**************************************************
Title/Text: Congress passes new bill on healthcare reforms
Predicted category: <POLITICS> (confidence: 1.00)
**************************************************
Title/Text: The stairway to love: Chongqing's real-life fairy tale
Predicted category: <ENTERTAINMENT> (confidence: 0.04)
**************************************************
Title/Text: African delegation take in Shanghai sights on Huangpu cruise
Predicted category: <TRAVEL> (confidence: 0.89)
**************************************************
Title/Text: China expected to achieve higher grain output in 2025: report
Predicted category: <WORLDPOST> (confidence: 0.01)
**************************************************
Title/Text: China continued its dominance at the 2025 World Aquatics Diving World Cup in Guadalajara, sweeping all four gold medals on the third day of competitions on Saturday, along with one silver.
Predicted category: <SPORTS> (confidence: 0.19)
**************************************************
Title/Text: A 'DeepSeek moment for AI agents' as China launches Manus
Predicted category: <TECH> (confidence: 0.99)
**************************************************
Title/Text: Developed by Monica.im, Manus achieved top scores on the GAIA (General AI Assistant) benchmark, exceeding those of OpenAI's GPT (generative pre-trained transformer) tools. GAIA is a real-world benchmark for general AI assistants.
Predicted category: <TECH> (confidence: 0.94)
**************************************************
Title/Text: This week and without warning, a horrid video popped up on my phone. A puppy had its mouth and paws bound with tape, and was hanging in a plastic bag by the motorway. I immediately flicked past, but the image stayed with me. This was something I didn’t want to see, yet there it was at 11am on a Tuesday.
Predicted category: <ENTERTAINMENT> (confidence: 0.04)
**************************************************
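
Several of the predictions above have very low confidence (0.04, 0.01), so it may help to look at more than the top label. The sketch below calls fastText's own `predict(text, k=...)`, which returns a tuple of labels and a matching sequence of probabilities. It assumes the model uses fastText's default `__label__` prefix, which this card does not confirm. The helper names, the 0.5 threshold, and the stub model with its invented labels and scores are all hypothetical; the stub only lets the example run without the model file.

```python
def top_k_categories(model, text, k=3, prefix="__label__"):
    """Return up to k (label, probability) pairs, best first."""
    # fastText's predict() processes one line at a time and rejects
    # newlines, so collapse all whitespace to single spaces first.
    clean = " ".join(text.split())
    labels, probs = model.predict(clean, k=k)
    results = []
    for label, prob in zip(labels, probs):
        if label.startswith(prefix):
            label = label[len(prefix):]
        results.append((label, float(prob)))
    return results


def is_confident(candidates, threshold=0.5):
    """True when the best candidate reaches the (arbitrary) threshold."""
    return bool(candidates) and candidates[0][1] >= threshold


class StubModel:
    """Mimics the (labels, probabilities) shape of predict(); values invented."""

    def predict(self, text, k=1):
        labels = ("__label__ENTERTAINMENT", "__label__CRIME", "__label__MEDIA")
        probs = (0.04, 0.03, 0.02)
        return labels[:k], probs[:k]


# With the real model, pass fasttext.load_model("PGS_news_classifier.bin").
candidates = top_k_categories(StubModel(), "A puppy was\nfound by the motorway", k=2)
print(candidates, is_confident(candidates))
```

A caller could route texts for which `is_confident` returns `False` to manual review instead of trusting the top label.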

🤝 4. Model Card Contact

Author: Pan Guisheng, a PhD student at the Graduate Institute of Interpretation and Translation, Shanghai International Studies University
Email: [email protected]
