---
language: en
license: mit
tags:
  - text-classification
  - fallacy-detection
  - logical-fallacies
  - argument-analysis
  - nlp
  - transformers
datasets:
  - custom
metrics:
  - accuracy: 1
  - f1: 1
model-index:
  - name: FallacyFinder
    results:
      - task:
          type: text-classification
          name: Fallacy Detection
        dataset:
          type: custom
          name: Balanced Fallacy Dataset
        metrics:
          - type: accuracy
            value: 1
          - type: f1
            value: 1
widget:
  - text: You're just a stupid liberal, so your opinion doesn't matter
    example_title: Ad Hominem Example
  - text: So you're saying we should let all criminals run free?
    example_title: Strawman Example
  - text: What about when you made the same mistake last year?
    example_title: Whataboutism Example
  - text: >-
      I understand your perspective, but here's why I disagree based on the
      evidence
    example_title: No Fallacy Example
---

# FallacyFinder: Advanced Logical Fallacy Detection Model

## Model Description

FallacyFinder is a text classification model trained to detect 15 types of logical fallacies, plus a no-fallacy class for healthy argumentation (16 classes in total). Built on the DistilBERT architecture, the model achieves 100% accuracy on its held-out test set, distinguishing argumentative fallacies from healthy logical discourse.

## Supported Fallacy Types

The model can detect the following 16 categories:

1. **Ad Hominem** - Personal attacks instead of addressing arguments
2. **Strawman** - Misrepresenting someone's position to make it easier to attack
3. **Whataboutism** - Deflecting criticism by pointing to other issues
4. **Gaslighting** - Making someone question their own reality or memory
5. **False Dichotomy** - Presenting only two options when more exist
6. **Appeal to Emotion** - Using emotional manipulation instead of logical reasoning
7. **DARVO** - Deny, Attack, and Reverse Victim and Offender
8. **Moving Goalposts** - Changing the criteria for acceptance when challenged
9. **Cherry Picking** - Selecting only evidence that supports your position
10. **Appeal to Authority** - Inappropriate reliance on authority figures
11. **Slippery Slope** - Claiming that one event will lead to extreme consequences
12. **Motte and Bailey** - Defending a weak position by conflating it with a stronger one
13. **Gish Gallop** - Overwhelming opponents with many weak arguments
14. **Kafkatrapping** - Claiming that denial of guilt proves guilt
15. **Sealioning** - Persistent bad-faith requests for evidence
16. **No Fallacy** - Healthy, logical communication
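
These names map one-to-one onto the checkpoint's class ids. A minimal sketch for listing the mapping, assuming the hosted config stores the `id2label` table (as `AutoModelForSequenceClassification` checkpoints normally do):

```python
from transformers import AutoConfig

# Load only the config to inspect the id-to-label mapping
config = AutoConfig.from_pretrained("SamanthaStorm/fallacyfinder")
for class_id, label in sorted(config.id2label.items()):
    print(class_id, label)
```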

## Performance

- **Accuracy:** 100% on the held-out test set
- **Average Confidence:** 98.2%
- **Minimum Confidence:** 77.1%
- **F1 Score:** 1.0 (macro average)
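
These figures come from the author's test set. Equivalent metrics can be recomputed on any labeled evaluation set; a minimal sketch with toy placeholder lists standing in for real gold and predicted labels:

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy placeholders: in practice, y_true comes from a labeled evaluation set
# and y_pred from running the model over the same texts
y_true = ["Ad Hominem", "Strawman", "No Fallacy"]
y_pred = ["Ad Hominem", "Strawman", "No Fallacy"]

print("Accuracy:", accuracy_score(y_true, y_pred))
print("Macro F1:", f1_score(y_true, y_pred, average="macro"))
```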

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "SamanthaStorm/fallacyfinder"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Classify a single text, returning the predicted label and its confidence
def predict_fallacy(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
        predicted_class_id = predictions.argmax().item()
        confidence = predictions.max().item()

    predicted_label = model.config.id2label[predicted_class_id]
    return predicted_label, confidence

# Example usage
text = "You're just being emotional and can't think rationally"
fallacy_type, confidence = predict_fallacy(text)
print(f"Fallacy Type: {fallacy_type}")
print(f"Confidence: {confidence:.3f}")
```

## Training Data

The model was trained on a carefully curated dataset of 3,200 examples (200 per category) with high-quality, diverse examples covering:

- Personal relationships
- Political discourse
- Workplace communication
- Online discussions
- Academic debates
- Social media interactions

## Model Architecture

- **Base Model:** DistilBERT (distilbert-base-uncased)
- **Task:** Multi-class text classification
- **Classes:** 16 (15 fallacy types plus No Fallacy)
- **Max Sequence Length:** 512 tokens
- **Training Epochs:** 3
- **Batch Size:** 16
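
The original training script is not published. A rough sketch of how a comparable model could be fine-tuned with the `Trainer` API, assuming a CSV of text/label pairs (the file name `train.csv` and its column names are assumptions):

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

# Hypothetical CSV with a "text" column and an integer "label" column (0-15);
# this is a stand-in, not the original dataset
dataset = load_dataset("csv", data_files={"train": "train.csv"})["train"]

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=512)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=16
)

# Hyperparameters mirror the card: 3 epochs, batch size 16
args = TrainingArguments(
    output_dir="fallacyfinder-ft",
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

Trainer(model=model, args=args, train_dataset=dataset).train()
```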

## Limitations and Considerations

- Trained primarily on English text
- Performance may vary on highly ambiguous or context-dependent cases
- Best suited for clear argumentative text
- May require fine-tuning for domain-specific applications

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{fallacyfinder2024,
  author = {SamanthaStorm},
  title = {FallacyFinder: Advanced Logical Fallacy Detection Model},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/SamanthaStorm/fallacyfinder}
}
```

## License

This model is released under the MIT License.

## Contact

For questions or issues, please open an issue on the model repository.