---
datasets:
- stanfordnlp/imdb
language:
- en
metrics:
- perplexity
base_model:
- distilbert/distilbert-base-uncased
pipeline_tag: fill-mask
library_name: transformers
---

# DistilBERT Fine-Tuned on IMDB for Masked Language Modeling

## Model Description

This model is a fine-tuned version of [**`distilbert-base-uncased`**](https://huggingface.co/distilbert/distilbert-base-uncased) for the masked language modeling task, trained on the IMDB dataset.

## Model Training Details

### Training Dataset

- **Dataset:** [IMDB dataset](https://huggingface.co/datasets/imdb) from Hugging Face
- **Dataset Splits:**
  - Train: 25,000 samples
  - Test: 25,000 samples
  - Unsupervised: 50,000 samples
- **Split Concatenation:** Training was performed on the train and unsupervised splits concatenated into a single dataset (75,000 samples).

### Training Arguments

The following parameters were used during fine-tuning (a sketch of the corresponding `TrainingArguments` appears at the end of this card):

- **Number of Training Epochs:** `10`
- **Overwrite Output Directory:** `True`
- **Evaluation Strategy:** `steps`
- **Evaluation Steps:** `500`
- **Checkpoint Save Strategy:** `steps`
- **Save Steps:** `500`
- **Load Best Model at End:** `True`
- **Metric for Best Model:** `eval_loss`
- **Direction:** Lower `eval_loss` is better (`greater_is_better = False`)
- **Learning Rate:** `2e-5`
- **Weight Decay:** `0.01`
- **Per-Device Batch Size (Training):** `32`
- **Per-Device Batch Size (Evaluation):** `32`
- **Warmup Steps:** `1,000`
- **Mixed Precision Training:** Enabled (`fp16 = True`)
- **Logging Steps:** `100`
- **Gradient Accumulation Steps:** `2`

### Early Stopping

- The model was configured with **early stopping** to prevent overfitting.
- Training stopped after **5.87 epochs** (21,000 steps), as `eval_loss` showed no significant further improvement.

## Evaluation Results

- **Metric Used:** `eval_loss`
- **Final Perplexity:** `8.34`
- **Best Checkpoint:** Checkpoint saved at the early-stopping point (step `21,000`).

## Model Usage

The model can be used for masked language modeling via the `fill-mask` pipeline from Hugging Face.

Example:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint into a fill-mask pipeline
mask_filler = pipeline("fill-mask", model="Prikshit7766/distilbert-finetuned-imdb-mlm")

text = "This is a great [MASK]."
predictions = mask_filler(text)

# Each prediction dict contains the completed sequence, the token, and its score
for pred in predictions:
    print(f">>> {pred['sequence']}")
```

**Output Example:**

```text
>>> This is a great movie.
>>> This is a great film.
>>> This is a great show.
>>> This is a great documentary.
>>> This is a great story.
```
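
## Training Script Sketch

For reference, below is a minimal sketch of how the training configuration listed above maps onto `transformers` code. It is an illustration rather than the exact training script used for this model: the tokenization step, the removed column names, and the early-stopping patience value are assumptions, and `eval_strategy` was named `evaluation_strategy` in older `transformers` releases.

```python
from datasets import concatenate_datasets, load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

imdb = load_dataset("stanfordnlp/imdb")
# Train on the concatenated train + unsupervised splits, as described above
train_data = concatenate_datasets([imdb["train"], imdb["unsupervised"]])

def tokenize(batch):
    # Simple truncation; the original preprocessing may have chunked texts instead
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_data = train_data.map(tokenize, batched=True, remove_columns=["text", "label"])
eval_data = imdb["test"].map(tokenize, batched=True, remove_columns=["text", "label"])

# Randomly masks 15% of tokens (the collator's default rate) for the MLM objective
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

training_args = TrainingArguments(
    output_dir="distilbert-finetuned-imdb-mlm",
    overwrite_output_dir=True,
    num_train_epochs=10,
    eval_strategy="steps",          # `evaluation_strategy` in older versions
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    warmup_steps=1000,
    fp16=True,
    logging_steps=100,
    gradient_accumulation_steps=2,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_data,
    eval_dataset=eval_data,
    data_collator=data_collator,
    # Patience value is an assumption; the card does not state it
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)],
)
trainer.train()
```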
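
The reported perplexity is the exponential of the evaluation cross-entropy loss, so a final perplexity of `8.34` corresponds to an `eval_loss` of roughly `ln(8.34) ≈ 2.12`. Continuing from the sketch above:

```python
import math

# Perplexity = exp(cross-entropy loss) for a masked language model
eval_results = trainer.evaluate()
print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")
```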