# NaBI Model: Nepali Bias & Information Classifier
The NaBI Model is a text classifier for Nepali content that automatically detects bias, misinformation, and hate speech. It was trained on a balanced dataset created by oversampling the originally imbalanced real-world NaBI data, and achieves 99% accuracy on that balanced split.
## Overview

Task: Multi-Class Text Classification

Categories:
- Bias (editorial bias, user-comment bias, etc.)
- Normal
- Misinformation
- Hate Speech
Model Performance:
The model achieves 99% accuracy on a balanced dataset obtained via oversampling to mitigate class imbalance.

Dataset Details:
The dataset is derived from real-world Nepali content, which was originally imbalanced. Oversampling was applied during training to ensure sufficient representation of the underrepresented classes.

Real-World Implications and Future Work:
Although oversampling allowed the model to learn effectively from balanced data, the original dataset remains imbalanced. Running the model over unlabeled real-world data (biased news, misinformation, etc.) can surface additional labeled examples, paving the way for a larger, more diverse dataset over time.
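The model card does not specify which oversampling method was used, but the idea can be illustrated with naive random oversampling: duplicate minority-class examples until every class matches the majority class count. A minimal sketch with hypothetical labels mirroring the NaBI classes (the `oversample` helper and the toy data are illustrative assumptions, not part of the released training code):

```python
import random
from collections import Counter

def oversample(examples, labels, seed=0):
    """Naive random oversampling: duplicate minority-class examples
    until every class matches the majority class count."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    # Group examples by class
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    # Pad each class up to the majority count by sampling with replacement
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = rng.choices(xs, k=target - len(xs))
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

# Toy imbalanced dataset (hypothetical, for illustration only)
texts = ["t1", "t2", "t3", "t4", "t5", "t6"]
labels = ["normal", "normal", "normal", "normal", "bias", "misinformation"]
bal_x, bal_y = oversample(texts, labels)
print(Counter(bal_y))  # each class now has 4 examples
```

In practice, a library such as imbalanced-learn offers more sophisticated strategies (e.g. SMOTE), but simple duplication is often sufficient for text classification when combined with a strong pretrained encoder.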
## Usage
Below is a simple example of how to use the NaBI Model with the Hugging Face Transformers library:

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
classifier = pipeline("text-classification", model="Utkarsha666/NaBI-Bert")

# Classify a sample Nepali text ("Place your text in Nepali here.")
sample_text = "यहाँ नेपालीमा तपाईंको पाठ राख्नुहोस्।"
result = classifier(sample_text)
print(result)
```
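The pipeline returns a list of dictionaries with `label` and `score` fields. For the dataset-expansion workflow described above, one reasonable approach is to keep only high-confidence predictions as candidate new labels. A minimal sketch, where the prediction values, the 0.9 cutoff, and the label names are assumptions for illustration (not taken from the model card):

```python
# Hypothetical pipeline outputs; each item follows the format returned
# by the transformers text-classification pipeline.
predictions = [
    {"text": "sample 1", "label": "misinformation", "score": 0.97},
    {"text": "sample 2", "label": "bias", "score": 0.62},
    {"text": "sample 3", "label": "hate_speech", "score": 0.91},
]

CONFIDENCE_THRESHOLD = 0.9  # assumed cutoff; tune for your use case

# Keep only high-confidence predictions as candidate new training examples
accepted = [p for p in predictions if p["score"] >= CONFIDENCE_THRESHOLD]
print([p["text"] for p in accepted])  # → ['sample 1', 'sample 3']
```

Low-confidence predictions can instead be routed to human annotators, which keeps label noise out of the expanded dataset.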
## Model Tree

Base model: google-bert/bert-base-multilingual-cased