DistilBERT Fine-Tuned on IMDB for Masked Language Modeling (Accelerate)
Model Description
This model is a fine-tuned version of distilbert-base-uncased for the masked language modeling (MLM) task. It was trained on the IMDB dataset using the Hugging Face 🤗 Accelerate library.
Model Training Details
Training Dataset
- Dataset: IMDB dataset from Hugging Face.
- Dataset Splits:
  - Train: 25,000 samples
  - Test: 25,000 samples
  - Unsupervised: 50,000 samples
- Training Strategy:
  - Combined the train and unsupervised splits for training, resulting in 75,000 training examples.
  - Applied fixed random masking to the evaluation set to ensure consistent perplexity scores.
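The idea behind fixed evaluation masking can be illustrated with a small sketch. This is not the training script used for this model: the `mask_tokens` helper, the 15% masking rate, and the seed value are assumptions chosen for illustration, and the real MLM recipe (which also replaces some tokens with random or unchanged ones) is simplified here to plain `[MASK]` substitution. The point is that seeding the random number generator makes the masked positions identical on every evaluation pass, so perplexity scores are comparable across epochs.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Replace each token with [MASK] with probability mask_prob.

    Passing a fixed seed makes the choice of masked positions
    reproducible (assumed rate and helper, for illustration only).
    """
    rng = random.Random(seed)
    return [MASK_TOKEN if rng.random() < mask_prob else tok for tok in tokens]


eval_tokens = "this movie was a great surprise from start to finish".split()

# Same seed -> same masked positions on every evaluation pass.
first_pass = mask_tokens(eval_tokens, seed=42)
second_pass = mask_tokens(eval_tokens, seed=42)
print(first_pass == second_pass)
```

During training, by contrast, masking is normally left unseeded so the model sees different masked positions each epoch.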
Training Configuration
The model was trained using the following parameters:
- Number of Training Epochs: 10
- Batch Size: 64 (per device)
- Learning Rate: 5e-5
- Weight Decay: 0.01
- Evaluation Strategy: After each epoch.
- Early Stopping: Enabled (patience = 3)
- Metric for Best Model: eval_loss
  - Direction: Lower eval_loss is better (greater_is_better = False).
- Learning Rate Scheduler: Linear decay with no warmup steps.
- Mixed Precision Training: Enabled (FP16).
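To make the scheduler setting concrete, here is a minimal sketch of linear decay with no warmup: the learning rate starts at its base value of 5e-5 and falls in a straight line to zero at the final optimizer step. The `linear_decay_lr` function and the `total_steps` value are hypothetical; the actual number of optimizer steps depends on how the text was chunked and on the number of devices, neither of which is stated here.

```python
def linear_decay_lr(step, total_steps, base_lr=5e-5):
    """Learning rate at a given optimizer step under linear decay, no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps)


total_steps = 1000  # hypothetical; the real step count is not given in this card

for step in (0, 250, 500, 750, 1000):
    print(step, linear_decay_lr(step, total_steps))
```

With no warmup, the very first step already uses the full base learning rate, and halfway through training the rate is half of it.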
Model Results
Best Epoch Performance
- Best Epoch: 9
- Loss: 2.0173
- Perplexity: 7.5178
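The two numbers above are related: perplexity is the exponential of the mean cross-entropy loss, assuming the loss is measured in nats, as is usual for PyTorch losses. A quick check with the reported loss:

```python
import math

eval_loss = 2.0173  # best-epoch loss reported above

# Perplexity is exp(mean cross-entropy loss), assuming a natural-log loss.
perplexity = math.exp(eval_loss)
print(f"Perplexity: {perplexity:.4f}")
```

The result is about 7.518, which agrees with the reported 7.5178 up to rounding of the loss to four decimal places.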
Early Stopping
- Training ran for the full 10 epochs because the evaluation loss continued to improve, so early stopping was not triggered.
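The early-stopping rule can be sketched as follows. This is an illustrative stand-in, not the code used for this run: the `EarlyStopper` class is hypothetical and the loss values in the loop are made up. Training stops once the evaluation loss has failed to improve for `patience` consecutive evaluations (3 in this configuration).

```python
class EarlyStopper:
    """Stop when eval loss has not improved for `patience` evaluations in a row."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def should_stop(self, eval_loss):
        if eval_loss < self.best_loss:
            # Lower is better: record the new best and reset the counter.
            self.best_loss = eval_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up loss curve for illustration; not the losses from this run.
stopper = EarlyStopper(patience=3)
for epoch, loss in enumerate([2.50, 2.40, 2.41, 2.42, 2.43], start=1):
    if stopper.should_stop(loss):
        print(f"Stopping after epoch {epoch}")
        break
```

A run whose loss keeps improving, or never stalls for three evaluations in a row, never trips the counter and completes all scheduled epochs.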
Model Usage
This fine-tuned model can be used for masked language modeling through the Hugging Face fill-mask pipeline. Below is an example:
from transformers import pipeline

# Load the fine-tuned checkpoint into a fill-mask pipeline
mask_filler = pipeline("fill-mask", model="Prikshit7766/distilbert-finetuned-imdb-mlm-accelerate")

text = "This is a great [MASK]."
predictions = mask_filler(text)

# Print each candidate sentence with the mask filled in
for pred in predictions:
    print(f">>> {pred['sequence']}")
Example Output:
>>> This is a great movie.
>>> This is a great film.
>>> This is a great show.
>>> This is a great story.
>>> This is a great documentary.