---
language: en
license: mit
tags:
- fake-news-detection
- deberta-v3-large
- text-classification
- binary-classification
- news-classification
datasets:
- mrisdal/fake-news
- jainpooja/fake-news-detection
- clmentbisaillon/fake-and-real-news-dataset
metrics:
- accuracy
- f1
- precision
- recall
widget:
- text: "Scientists announce breakthrough discovery of alien life on Mars!"
  example_title: "Suspicious Claim"
- text: "The Federal Reserve announced a 0.25% interest rate increase following their monthly meeting."
  example_title: "Financial News"
model-index:
- name: Arko007/fact-check1-v1
  results:
  - task:
      type: text-classification
      name: Fake News Detection
    metrics:
    - type: accuracy
      value: 99.98
      name: Validation Accuracy
    - type: f1
      value: 99.98
      name: Validation F1-Score
---

# Elite Fake News Detection Model

## Model Description

This is a **state-of-the-art** fake news detection model based on **DeBERTa-v3-large**, achieving **99.98% accuracy** on validation data. The model was fine-tuned on a carefully curated and deduplicated dataset combining multiple high-quality fake news datasets, totaling **51,319 samples** after preprocessing.

## Performance Highlights

- **Validation Accuracy**: 99.98%
- **Test Accuracy**: 99.94%
- **F1-Score**: 99.98%
- **Precision**: 99.97%
- **Recall**: 100.00%

## Model Architecture

- **Base Model**: microsoft/deberta-v3-large
- **Task**: Binary Text Classification (Real vs. Fake News)
- **Parameters**: ~400M (see the sketch below)
- **Training Hardware**: NVIDIA A100-SXM4-80GB
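
For orientation, the classification head sits on top of the base checkpoint; the snippet below is a minimal sketch (not the training script for this model) showing how the same 2-label architecture can be instantiated and the ~400M parameter figure checked.

```python
from transformers import AutoModelForSequenceClassification

# Sketch only: rebuild the same architecture from the base checkpoint with a
# 2-label head and count its parameters to sanity-check the ~400M figure.
base = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-large",
    num_labels=2,  # binary head: real vs. fake
)
total_params = sum(p.numel() for p in base.parameters())
print(f"Total parameters: {total_params / 1e6:.0f}M")
```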

## Training Details

- **Training Steps**: 640
- **Batch Size**: 64
- **Learning Rate**: 3e-05
- **Max Length**: 512 tokens
- **Training Time**: 0.43 hours
- **Gradient Checkpointing**: Non-reentrant (memory-optimized); see the sketch after this list for a rough `TrainingArguments` mapping
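
The card does not include the training code, but as a rough sketch these hyperparameters might map onto the Hugging Face `Trainer` API as follows; the mixed-precision choice and logging cadence are assumptions, and dataset loading, tokenization, and the `Trainer` call itself are omitted.

```python
from transformers import TrainingArguments

# Hypothetical mapping of the reported hyperparameters onto TrainingArguments;
# the actual training setup for this model may have differed.
training_args = TrainingArguments(
    output_dir="fact-check1-v1",
    max_steps=640,                   # reported training steps
    per_device_train_batch_size=64,  # reported batch size
    learning_rate=3e-5,              # reported learning rate
    gradient_checkpointing=True,     # memory-optimized training
    gradient_checkpointing_kwargs={"use_reentrant": False},  # non-reentrant variant
    bf16=True,                       # assumption: mixed precision on the A100
    logging_steps=50,                # assumption
)
# The 512-token max length is applied on the tokenizer side, e.g.
# tokenizer(texts, truncation=True, max_length=512).
```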

## Dataset Information

**Total Samples**: 51,319

- **Training**: 41,055 samples
- **Validation**: 5,132 samples
- **Test**: 5,132 samples
- **Fake News**: 30,123 samples
- **Real News**: 21,196 samples

**Source Datasets**:

- `mrisdal/fake-news`
- `jainpooja/fake-news-detection`
- `clmentbisaillon/fake-and-real-news-dataset`
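
The exact preprocessing pipeline is not published with this card, but the merge, deduplication, and split steps could look roughly like the sketch below; the file names, column names, and 0 = real / 1 = fake label convention are placeholders for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical preprocessing sketch: merge the source CSVs, deduplicate on the
# article text, and split roughly 80/10/10 (about 41k / 5k / 5k samples).
frames = [
    pd.read_csv("mrisdal_fake_news.csv"),
    pd.read_csv("jainpooja_fake_news_detection.csv"),
    pd.read_csv("clmentbisaillon_fake_and_real_news.csv"),
]
df = pd.concat(frames, ignore_index=True)[["text", "label"]].dropna()
df = df.drop_duplicates(subset="text")  # deduplication step

train_df, holdout_df = train_test_split(
    df, test_size=0.2, random_state=42, stratify=df["label"]
)
val_df, test_df = train_test_split(
    holdout_df, test_size=0.5, random_state=42, stratify=holdout_df["label"]
)
print(len(train_df), len(val_df), len(test_df))
```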

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "Arko007/fact-check1-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example prediction function
def predict_fake_news(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
    prediction = torch.argmax(probabilities, dim=-1).item()

    labels = {0: "REAL", 1: "FAKE"}
    confidence = probabilities[0][prediction].item()

    return {
        "prediction": labels[prediction],
        "confidence": confidence,
        "probabilities": {
            "REAL": probabilities[0][0].item(),
            "FAKE": probabilities[0][1].item(),
        },
    }

# Test the model
text = "Breaking: Scientists discover new planet in our solar system!"
result = predict_fake_news(text)
print(f"Prediction: {result['prediction']} ({result['confidence']:.2%} confidence)")
```
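
Alternatively, the model can be called through the high-level `pipeline` API. Note that the returned label names come from the model's `config.id2label` mapping, so they may appear as `LABEL_0`/`LABEL_1` if that mapping was not customized in the uploaded config.

```python
from transformers import pipeline

# High-level alternative to the manual tokenize/forward/softmax flow above.
classifier = pipeline("text-classification", model="Arko007/fact-check1-v1")
print(classifier("The Federal Reserve announced a 0.25% interest rate increase."))
# e.g. [{'label': 'LABEL_1', 'score': 0.99}] -- label names depend on the model config
```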

## Model Performance

This model achieves **research-grade performance** on fake news detection, with near-perfect accuracy across all reported metrics on its validation and test splits. The high precision and recall indicate a strong balance between catching fake news and avoiding false positives on real news.
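
To check numbers like these on your own held-out articles, the standard metrics can be computed as in the sketch below, which reuses the `predict_fake_news` helper from the Usage section; `texts` and `gold` are placeholders for your evaluation data.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Placeholder evaluation set: replace with your own labelled articles.
texts = ["Example real article ...", "Example fake article ..."]
gold = [0, 1]  # 0 = REAL, 1 = FAKE, matching the label mapping used above

preds = [1 if predict_fake_news(t)["prediction"] == "FAKE" else 0 for t in texts]
precision, recall, f1, _ = precision_recall_fscore_support(gold, preds, average="binary")
print(f"accuracy={accuracy_score(gold, preds):.4f}  "
      f"precision={precision:.4f}  recall={recall:.4f}  f1={f1:.4f}")
```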

## Limitations and Bias

- Trained primarily on English news articles
- Performance may vary on news domains not represented in the training data
- May reflect biases present in the source datasets
- Designed for binary classification (fake vs. real) only

## Citation

```bibtex
@misc{fake-news-deberta-2025,
  author = {Arko007},
  title = {Elite Fake News Detection with DeBERTa-v3-Large},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Arko007/fact-check1-v1}
}
```

## License

MIT License - Feel free to use this model for research and applications.

---

**Built with ❤️ using A100 80GB + DeBERTa-v3-Large**
|