metadata
library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
  - paraphrase-detection
  - sentence-pair-classification
  - glue
  - mrpc
metrics:
  - accuracy
  - f1
model-index:
  - name: bert_paraphrase
    results:
      - task:
          name: Paraphrase Detection
          type: text-classification
        dataset:
          name: GLUE MRPC
          type: glue
          config: mrpc
          split: validation
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.8676
          - name: F1
            type: f1
            value: 0.9078
language:
  - en
bert_paraphrase
This model is a fine-tuned version of bert-base-uncased on the Microsoft Research Paraphrase Corpus (MRPC), one of the tasks in the GLUE benchmark.
It is trained to classify whether two sentences are semantically equivalent (paraphrases) or not.
Evaluation Results
The model achieves the following results on the GLUE MRPC validation split:
- Loss: 0.4042
- Accuracy: 0.8676
- F1: 0.9078
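Accuracy and F1 are computed differently, so the two numbers need not agree. As a minimal sketch, assuming F1 treats label 1 (paraphrase) as the positive class, as is conventional for GLUE MRPC, both metrics can be derived from labels and predictions as shown below. The function name and the toy data are illustrative only and are not taken from the training script.

```python
def accuracy_and_f1(labels, preds, positive=1):
    """Return (accuracy, F1) for binary labels, treating `positive` as the positive class."""
    if not labels or len(labels) != len(preds):
        raise ValueError("labels and preds must be non-empty and of equal length")
    tp = sum(1 for y, p in zip(labels, preds) if y == positive and p == positive)
    fp = sum(1 for y, p in zip(labels, preds) if y != positive and p == positive)
    fn = sum(1 for y, p in zip(labels, preds) if y == positive and p != positive)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    accuracy = correct / len(labels)
    # F1 = 2*TP / (2*TP + FP + FN); reported as 0.0 when the denominator is zero
    denom = 2 * tp + fp + fn
    f1 = (2 * tp / denom) if denom else 0.0
    return accuracy, f1

# Toy data (hypothetical, not model output)
labels = [1, 1, 1, 0, 0, 1]
preds = [1, 1, 0, 0, 1, 1]
acc, f1 = accuracy_and_f1(labels, preds)
print(f"accuracy={acc:.4f} f1={f1:.4f}")
```

Because F1 ignores true negatives, it can exceed accuracy when the positive class is well predicted.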
Model Description
- Model type: BERT-base (uncased)
- Task: Binary classification (paraphrase vs not paraphrase)
- Languages: English
- Labels:
  - 0 → Not paraphrase
  - 1 → Paraphrase
Intended Uses & Limitations
Intended uses
- Detect if two sentences convey the same meaning.
- Useful for:
  - Duplicate question detection (e.g., Quora, FAQ bots).
  - Semantic similarity search.
  - Improving information retrieval systems.
Limitations
- Trained only on English text (the MRPC dataset).
- May not generalize well to domains unlike MRPC's news-derived sentences (e.g., legal, medical).
- Binary labels only (no "degree of similarity").
How to Use
You can use this model with the Hugging Face pipeline for quick inference:
from transformers import pipeline
paraphrase_detector = pipeline(
    "text-classification",
    model="azherali/bert_paraphrase",
    tokenizer="azherali/bert_paraphrase"
)
single_pair = [
    {"text": "The car is red.", "text_pair": "The automobile is red."},
]
result = paraphrase_detector(single_pair)
print(result)
# Output: [{'label': 'paraphrase', 'score': 0.9801033139228821}]
# Test pairs
pairs = [
    {"text": "The car is red.", "text_pair": "The automobile is red."},
    {"text": "He enjoys playing football.", "text_pair": "She likes cooking."},
]
result = paraphrase_detector(pairs)
print(result)
# Output: [{'label': 'paraphrase', 'score': 0.9801033139228821}, {'label': 'not_paraphrase', 'score': 0.9302119016647339}]
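The pipeline returns only the winning label and its score for each pair. A small post-processing step can turn that into a probability for the paraphrase class and line it up with the inputs. The sketch below assumes the output format shown above, the label names paraphrase and not_paraphrase, and that the score comes from a softmax over the two classes (so the losing class has probability 1 - score). The helper name `paraphrase_probability` is illustrative, and the scores are rounded copies of the example output rather than a fresh model run.

```python
def paraphrase_probability(prediction):
    """Convert one pipeline prediction into P(paraphrase).

    Assumes a two-class softmax, so the other class has probability 1 - score.
    """
    if prediction["label"] == "paraphrase":
        return prediction["score"]
    return 1.0 - prediction["score"]

pairs = [
    {"text": "The car is red.", "text_pair": "The automobile is red."},
    {"text": "He enjoys playing football.", "text_pair": "She likes cooking."},
]
# Rounded from the example output; in practice use: result = paraphrase_detector(pairs)
result = [
    {"label": "paraphrase", "score": 0.9801},
    {"label": "not_paraphrase", "score": 0.9302},
]

for pair, prediction in zip(pairs, result):
    prob = paraphrase_probability(prediction)
    print(f"{prob:.4f}  {pair['text']!r} vs {pair['text_pair']!r}")
```

Having a single probability per pair makes it straightforward to apply a custom decision threshold instead of the default argmax.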
Using AutoModel & AutoTokenizer:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
tokenizer = AutoTokenizer.from_pretrained("azherali/bert_paraphrase")
model = AutoModelForSequenceClassification.from_pretrained("azherali/bert_paraphrase")
# Example sentences
sent1 = "The quick brown fox jumps over the lazy dog."
sent2 = "A fast brown fox leaps over a lazy dog."
inputs = tokenizer(sent1, sent2, return_tensors="pt")
# Disable gradient tracking for inference
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
predicted_class = torch.argmax(logits, dim=1).item()
print("Prediction:", model.config.id2label[predicted_class])
# Output: Prediction: paraphrase
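torch.argmax returns only the predicted class. To see how confident the model is, the logits can be passed through a softmax. The sketch below does this in plain Python for a single pair of logits. The logit values are made up for illustration, and the index order follows the label mapping in Model Description (0 = not paraphrase, 1 = paraphrase). With PyTorch, `torch.softmax(logits, dim=1)` performs the same computation on the tensor.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one sentence pair: [not_paraphrase, paraphrase]
example_logits = [-1.2, 2.3]
probs = softmax(example_logits)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(f"P(not paraphrase)={probs[0]:.4f}  P(paraphrase)={probs[1]:.4f}  class={predicted_class}")
```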
Training and evaluation data
The model was fine-tuned on GLUE MRPC; the reported metrics are computed on the validation split.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08, no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
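For reference, a linear scheduler decays the learning rate from its initial value to zero over the course of training. The sketch below assumes no warmup steps (the card does not list any) and 690 total optimizer steps, taken from the final row of the Training results table. `linear_lr` is a hypothetical helper written for illustration, not a function from transformers.

```python
def linear_lr(step, base_lr=2e-5, total_steps=690, warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start and at the end of each epoch (230 steps per epoch)
for step in (0, 230, 460, 690):
    print(f"step {step:>3}: lr = {linear_lr(step):.2e}")
```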
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | 
|---|---|---|---|---|---|
| No log | 1.0 | 230 | 0.3894 | 0.8309 | 0.8836 | 
| No log | 2.0 | 460 | 0.3511 | 0.8505 | 0.8964 | 
| 0.4061 | 3.0 | 690 | 0.4042 | 0.8676 | 0.9078 | 
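
The step counts in the table are consistent with the listed batch size. As a quick check, assuming the standard GLUE MRPC training split of 3,668 examples, a single device, and no gradient accumulation (none of which the card states explicitly):

```python
import math

train_examples = 3668   # assumption: size of the standard GLUE MRPC train split
train_batch_size = 16   # from the hyperparameters listed in this card
num_epochs = 3          # from the hyperparameters listed in this card

# The last, partially filled batch still counts as one optimizer step
steps_per_epoch = math.ceil(train_examples / train_batch_size)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)  # 230 690
```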
Framework versions
- Transformers 4.55.2
- Pytorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.21.4
