1. Model Details
Attribute | Value |
---|---|
Developed by | Petercusin (Guisheng Pan) |
Model Architecture | fasttext |
Number of words | 267,799 |
Number of labels | 42 |
Labels | ['Arts', 'Arts & Culture', 'Black voices', 'Business', 'College', 'Comedy', 'Crime', 'Culture & Arts', 'Divorce', 'Education', 'Entertainment', 'Environment', 'Fifty', 'Food & Drink', 'Good news', 'Green', 'Healthy living', 'Home & Living', 'Impact', 'Latino voices', 'Media', 'Money', 'Parenting', 'Parents', 'Politics', 'Queer voices', 'Religion', 'Science', 'Sports', 'Style', 'Style & Beauty', 'Taste', 'Tech', 'The worldpost', 'Travel', 'U.S. news', 'Weddings', 'Weird news', 'Wellness', 'Women', 'World news', 'Worldpost'] |
Words/sec/thread | 497,437 |
Learning rate | 0.000000 |
Average loss | 0.550502 |
Model saved to | PGS_news_classifier.bin (890M) |
Precision@1 | 0.7414219768714605 |
Recall@1 | 0.7414219768714605 |
Number of examples | 42,026 (test data) |
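Precision@1 and Recall@1 are identical in the table above. That is what one would expect if every test example carries exactly one gold label (an assumption; the label format of the test data is not stated here), because both metrics then reduce to plain accuracy. A minimal sketch of the usual micro-averaged definitions, using made-up toy data and a hypothetical helper name:

```python
def precision_recall_at_k(gold, predicted, k=1):
    """Micro-averaged precision@k and recall@k.

    gold:      list of sets of true labels, one set per example
    predicted: list of ranked label lists, one list per example
    """
    hits = 0         # predicted labels that are also gold labels
    n_predicted = 0  # total number of labels predicted (at most k per example)
    n_gold = 0       # total number of gold labels
    for truth, ranked in zip(gold, predicted):
        top_k = ranked[:k]
        hits += sum(1 for label in top_k if label in truth)
        n_predicted += len(top_k)
        n_gold += len(truth)
    return hits / n_predicted, hits / n_gold


# Toy data (hypothetical, not taken from the actual test set)
gold = [{"POLITICS"}, {"SPORTS"}, {"TECH"}, {"TRAVEL"}]
predicted = [["POLITICS"], ["SPORTS"], ["BUSINESS"], ["TRAVEL"]]

precision, recall = precision_recall_at_k(gold, predicted, k=1)
print(precision, recall)  # 0.75 0.75
```

With one gold label and one prediction per example, both denominators equal the number of examples, so the two values match.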
2. Model Description
This model classifies English news articles into the 42 topic categories listed above. It can be used for news categorization, content organization, and topic-based filtering.
3. How to Get Started with the Model
# -*- coding: utf-8 -*-
"""
Created on Sun Apr 20 20:24:58 2025
@author: Petercusin
"""
import fasttext
# pip install pgsfile
# https://pypi.org/project/PgsFile/
from PgsFile import predict_category
# Load the trained model
model_path = "PGS_news_classifier.bin"
model = fasttext.load_model(model_path)
# New titles or texts to classify
new_titles = [
    "Stock markets reach all-time high amid economic recovery",
    "Scientists discover new species in Amazon rainforest",
    "Congress passes new bill on healthcare reforms",
    "The stairway to love: Chongqing's real-life fairy tale",
    "African delegation take in Shanghai sights on Huangpu cruise",
    "China expected to achieve higher grain output in 2025: report",
    "China continued its dominance at the 2025 World Aquatics Diving World Cup in Guadalajara, sweeping all four gold medals on the third day of competitions on Saturday, along with one silver.",
    "A 'DeepSeek moment for AI agents' as China launches Manus",
    "Developed by Monica.im, Manus achieved top scores on the GAIA (General AI Assistant) benchmark, exceeding those of OpenAI's GPT (generative pre-trained transformer) tools. GAIA is a real-world benchmark for general AI assistants.",
    "This week and without warning, a horrid video popped up on my phone. A puppy had its mouth and paws bound with tape, and was hanging in a plastic bag by the motorway. I immediately flicked past, but the image stayed with me. This was something I didn't want to see, yet there it was at 11am on a Tuesday."
]
try:
    # Process new titles
    for title in new_titles:
        label, score = predict_category(model, title)
        print(f"Title/Text: {title}")
        print(f"Predicted category: <{label}> (confidence: {score:.2f})")
        print("*" * 50)  # Add a separator line for better readability
finally:
    # Explicitly delete the model to free up memory
    del model
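The `predict_category` helper comes from PgsFile, and its implementation is not shown here. As a rough sketch of what such a helper might do: fastText's `model.predict(text, k=1)` returns a tuple of labels (prefixed with `__label__` by default) together with their probabilities, and it rejects input containing newlines. The names `predict_top_category` and `StubModel` below are hypothetical; the stub stands in for the real model so the snippet runs without the 890M file.

```python
class StubModel:
    """Hypothetical stand-in that mimics the return shape of fastText's predict()."""

    def predict(self, text, k=1):
        return (("__label__TECH",), [0.99])


def predict_top_category(model, text, prefix="__label__"):
    """Return (label, probability) for the top prediction.

    Assumes the model was trained with fastText's default label prefix.
    """
    # Collapse all whitespace, including newlines, into single spaces,
    # because fastText predicts one line at a time.
    cleaned = " ".join(text.split())
    labels, probs = model.predict(cleaned, k=1)
    label = labels[0]
    if label.startswith(prefix):
        label = label[len(prefix):]
    return label, float(probs[0])


label, score = predict_top_category(StubModel(), "China launches\nManus")
print(f"<{label}> (confidence: {score:.2f})")  # <TECH> (confidence: 0.99)
```

Passing the object returned by `fasttext.load_model` instead of `StubModel()` should work the same way, provided the label prefix assumption holds for this model.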
Result
Title/Text: Stock markets reach all-time high amid economic recovery
Predicted category: <BUSINESS> (confidence: 1.00)
**************************************************
Title/Text: Scientists discover new species in Amazon rainforest
Predicted category: <SCIENCE> (confidence: 0.99)
**************************************************
Title/Text: Congress passes new bill on healthcare reforms
Predicted category: <POLITICS> (confidence: 1.00)
**************************************************
Title/Text: The stairway to love: Chongqing's real-life fairy tale
Predicted category: <ENTERTAINMENT> (confidence: 0.04)
**************************************************
Title/Text: African delegation take in Shanghai sights on Huangpu cruise
Predicted category: <TRAVEL> (confidence: 0.89)
**************************************************
Title/Text: China expected to achieve higher grain output in 2025: report
Predicted category: <WORLDPOST> (confidence: 0.01)
**************************************************
Title/Text: China continued its dominance at the 2025 World Aquatics Diving World Cup in Guadalajara, sweeping all four gold medals on the third day of competitions on Saturday, along with one silver.
Predicted category: <SPORTS> (confidence: 0.19)
**************************************************
Title/Text: A 'DeepSeek moment for AI agents' as China launches Manus
Predicted category: <TECH> (confidence: 0.99)
**************************************************
Title/Text: Developed by Monica.im, Manus achieved top scores on the GAIA (General AI Assistant) benchmark, exceeding those of OpenAI's GPT (generative pre-trained transformer) tools. GAIA is a real-world benchmark for general AI assistants.
Predicted category: <TECH> (confidence: 0.94)
**************************************************
Title/Text: This week and without warning, a horrid video popped up on my phone. A puppy had its mouth and paws bound with tape, and was hanging in a plastic bag by the motorway. I immediately flicked past, but the image stayed with me. This was something I didn't want to see, yet there it was at 11am on a Tuesday.
Predicted category: <ENTERTAINMENT> (confidence: 0.04)
**************************************************
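Several of the predictions above come with very low confidence (0.01, 0.04, 0.19), so it may be reasonable to treat them as undecided rather than accept the top label. The sketch below splits predictions by a confidence threshold; the helper name `flag_uncertain` and the cutoff of 0.5 are assumptions chosen for illustration, not values recommended by the author.

```python
def flag_uncertain(predictions, threshold=0.5):
    """Split (label, confidence) pairs into accepted and uncertain lists."""
    accepted, uncertain = [], []
    for label, confidence in predictions:
        if confidence >= threshold:
            accepted.append((label, confidence))
        else:
            uncertain.append((label, confidence))
    return accepted, uncertain


# (label, confidence) pairs copied from the output above
predictions = [
    ("BUSINESS", 1.00),
    ("SCIENCE", 0.99),
    ("POLITICS", 1.00),
    ("ENTERTAINMENT", 0.04),
    ("TRAVEL", 0.89),
    ("WORLDPOST", 0.01),
    ("SPORTS", 0.19),
    ("TECH", 0.99),
    ("TECH", 0.94),
    ("ENTERTAINMENT", 0.04),
]

accepted, uncertain = flag_uncertain(predictions)
print(len(accepted), len(uncertain))  # 6 4
```

A suitable threshold depends on the application and would normally be tuned on held-out data.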
4. Model Card Contact
Author: Pan Guisheng, a PhD student at the Graduate Institute of Interpretation and Translation, Shanghai International Studies University
Email: [email protected]