metadata

title: TruthCheck - Fake News Detection
emoji: 📰
colorFrom: red
colorTo: blue
sdk: streamlit
sdk_version: 1.28.1
app_file: app.py
pinned: false
license: mit

TruthCheck: Fake News Detection with Fine-Tuned BERT

TruthCheck is a fake news detection system built on a hybrid deep learning architecture. It passes the output of a pre-trained BERT-base-uncased encoder through a BiLSTM and an attention mechanism, and the whole stack is fine-tuned end to end on a curated dataset of real and fake news. The project covers preprocessing, feature extraction, model training, evaluation, and a Streamlit web app for interactive predictions.

🚀 Features

Hybrid Model: BERT-base-uncased + BiLSTM + Attention
Full Fine-Tuning: All layers of BERT and additional layers are trainable and optimized on the fake news dataset
Comprehensive Preprocessing: Cleaning, tokenization, lemmatization, and more
Training & Evaluation: Scripts for training, validation, and test evaluation
Interactive App: Streamlit web app for real-time news classification
Ready for Deployment: Easily extendable for research or production
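
The cleaning step listed above can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual preprocessing code: the function name `clean_text` and the specific rules (HTML unescaping, tag and URL removal, whitespace collapsing) are assumptions chosen for illustration. Lemmatization is left out because it needs an NLP library, and lowercasing is left to the bert-base-uncased tokenizer.

```python
import html
import re
import unicodedata

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")
SPACE_RE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Normalize a raw news string before it is handed to a tokenizer."""
    text = html.unescape(text)                  # "&amp;" -> "&", "&nbsp;" -> U+00A0
    text = unicodedata.normalize("NFKC", text)  # fold compatibility characters (U+00A0 -> space)
    text = TAG_RE.sub(" ", text)                # drop leftover HTML tags
    text = URL_RE.sub(" ", text)                # drop URLs
    return SPACE_RE.sub(" ", text).strip()      # collapse runs of whitespace


if __name__ == "__main__":
    raw = "<p>Breaking&nbsp;news:  read more at https://example.com/story </p>"
    print(clean_text(raw))
```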

🧠 Model Details

Base Model: BERT-base-uncased
Architecture:
- BERT encoder (pre-trained, all layers fine-tuned)
- BiLSTM layer for sequential context
- Attention mechanism for interpretability
- Fully connected classification head
Fine-Tuning Technique:
- All BERT layers are unfrozen and updated during training (full fine-tuning)
- Additional layers (BiLSTM, attention, classifier) are trained from scratch
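
The layer stack described above could be wired together roughly as follows. This is a sketch under assumptions, not the repository's model code: the class name `BertBiLSTMAttention`, the helper `build_model`, and the hyperparameters `lstm_dim` and `dropout` are hypothetical, and the attention shown is a simple learned scoring layer with masked softmax pooling. For brevity the BiLSTM runs over padded positions without packing.

```python
import torch
import torch.nn as nn


class BertBiLSTMAttention(nn.Module):
    """Encoder -> BiLSTM -> attention pooling -> linear classifier."""

    def __init__(self, encoder: nn.Module, encoder_dim: int = 768,
                 lstm_dim: int = 128, num_classes: int = 2, dropout: float = 0.1):
        super().__init__()
        self.encoder = encoder  # not frozen, so it is fine-tuned with the rest
        self.lstm = nn.LSTM(encoder_dim, lstm_dim, batch_first=True, bidirectional=True)
        self.attn_score = nn.Linear(2 * lstm_dim, 1)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(2 * lstm_dim, num_classes)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        seq, _ = self.lstm(hidden)                        # (batch, tokens, 2 * lstm_dim)
        scores = self.attn_score(seq).squeeze(-1)         # (batch, tokens)
        scores = scores.masked_fill(attention_mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)           # padding tokens get weight 0
        pooled = torch.bmm(weights.unsqueeze(1), seq).squeeze(1)  # weighted sum over tokens
        return self.classifier(self.dropout(pooled)), weights


def build_model() -> BertBiLSTMAttention:
    """Wrap the pre-trained checkpoint (downloads weights on first use)."""
    from transformers import BertModel  # lazy import; requires the `transformers` package
    return BertBiLSTMAttention(BertModel.from_pretrained("bert-base-uncased"))
```

Because no parameter is frozen, passing `model.parameters()` to an optimizer updates the encoder together with the new layers, which is what full fine-tuning means here. The returned attention weights can be inspected per token when interpreting a prediction.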

📥 Download Data and Model

Raw and Processed Datasets:
Google Drive Link

Trained Model(s):
Google Drive Link

Instructions:

Download the datasets and place them in the data/ directory:
- data/raw/ for raw files
- data/processed/ for processed files
Download the trained model (e.g., final_model.pt or best_model.pt) and place it in models/saved/.
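
Before training or launching the app, it can help to confirm that the downloads landed where the instructions above expect them. The following is a small sketch of such a check; the helper name `missing_artifacts` is hypothetical, and the two checkpoint file names are only the examples mentioned above, so adjust them to whatever the download actually contains.

```python
from pathlib import Path

EXPECTED_DIRS = ("data/raw", "data/processed", "models/saved")
CHECKPOINT_NAMES = ("final_model.pt", "best_model.pt")  # example names from the instructions


def missing_artifacts(root: str = ".") -> list:
    """Return a list of problems with the expected layout (empty list means OK)."""
    base = Path(root)
    problems = [f"missing directory: {d}" for d in EXPECTED_DIRS
                if not (base / d).is_dir()]
    if not any((base / "models/saved" / name).is_file() for name in CHECKPOINT_NAMES):
        problems.append("no checkpoint found in models/saved")
    return problems


if __name__ == "__main__":
    for problem in missing_artifacts():
        print(problem)
```

Run it from the repository root; no output means the directories and a checkpoint file are in place.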

⚙️ Setup

Clone the repository:

git clone https://github.com/adnaan-tariq/fake-news-detection.git
cd fake-news-detection

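Create and activate a virtual environment (the activation command below is for Windows; on macOS or Linux, run source venv/bin/activate instead):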

python -m venv venv
.\venv\Scripts\activate

Install dependencies:

pip install --upgrade pip
pip install -r requirements.txt

🏃‍♂️ Usage

Train the Model

To train the model yourself instead of using the downloaded checkpoint (after placing the data as described above):

python -m src.train

Run the Streamlit App

streamlit run app.py

Open http://localhost:8501 in your browser.

Test the Model

The app and scripts load the model from models/saved/final_model.pt by default.
For custom inference, see the example in src/app.py, or open an issue to request a sample script.
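
For custom inference, the last step is turning the classifier's raw scores into a label and a confidence. The sketch below shows one way to do that with a numerically stable softmax. The helper name `interpret_logits` and the label order in `LABELS` are assumptions for illustration; check the training code for the real class indices before relying on them.

```python
import math

LABELS = ("REAL", "FAKE")  # assumed index order, not confirmed by the project


def interpret_logits(logits):
    """Turn raw class logits into (label, confidence) using a stable softmax."""
    peak = max(logits)                              # subtract the max to avoid overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


if __name__ == "__main__":
    label, confidence = interpret_logits([-1.2, 2.3])  # made-up logits
    print(f"{label} ({confidence:.1%})")
```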

📊 Results

Validation Accuracy: ~93%
Validation F1 Score: ~0.93
(See training logs and visualizations for more details.)
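
For readers unfamiliar with the two metrics, here is a short worked sketch of how accuracy and F1 are computed from predictions. It assumes a binary F1 on one positive class; the source does not say whether the reported score is binary, macro, or weighted. The labels below are toy values for illustration only, not project results.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and positive-class F1 from two parallel label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Toy labels, not taken from the project's validation set.
    acc, f1 = accuracy_and_f1([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 0, 1])
    print(f"accuracy={acc:.3f} f1={f1:.3f}")
```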

📦 Data & Model Policy

Data and model files are NOT included in this repository.
Please download them from the provided Google Drive links above.

🤝 Contributing

Pull requests and suggestions are welcome! For major changes, please open an issue first to discuss what you would like to change.

📄 License

This project is licensed under the MIT License.