Commit 0ed955f (verified) by Prikshit7766 · parent 2161ada

Update README.md

Files changed (1): README.md (+88 -77)

README.md (updated version below):
---
datasets:
- stanfordnlp/imdb
language:
- en
metrics:
- perplexity
base_model:
- distilbert/distilbert-base-uncased
pipeline_tag: fill-mask
library_name: transformers
---

# DistilBERT Fine-Tuned on IMDB for Masked Language Modeling

## Model Description

This model is a fine-tuned version of [**`distilbert-base-uncased`**](https://huggingface.co/distilbert/distilbert-base-uncased) for masked language modeling. It was trained on the IMDb movie-review dataset.

## Model Training Details

### Training Dataset

- **Dataset:** [IMDB dataset](https://huggingface.co/datasets/stanfordnlp/imdb) from Hugging Face
- **Dataset Splits:**
  - Train: 25,000 samples
  - Test: 25,000 samples
  - Unsupervised: 50,000 samples
- **Split Concatenation:** Training was performed on the concatenation of the `train` and `unsupervised` splits (75,000 reviews in total); a sketch of this step follows the list.

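
As a minimal sketch (not the original training script), the combined corpus could be built with the `datasets` library like this; the variable names are illustrative:

```python
from datasets import load_dataset, concatenate_datasets

# Load the three IMDb splits from the Hugging Face Hub.
imdb = load_dataset("stanfordnlp/imdb")

# Combine the labeled train split with the unlabeled reviews;
# sentiment labels are irrelevant for masked language modeling.
mlm_corpus = concatenate_datasets([imdb["train"], imdb["unsupervised"]])

print(len(mlm_corpus))  # 75,000 reviews
```
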
### Training Arguments

The following parameters were used during fine-tuning (mirrored in the sketch after the list):

- **Number of Training Epochs:** `10`
- **Overwrite Output Directory:** `True`
- **Evaluation Strategy:** `steps`
- **Evaluation Steps:** `500`
- **Checkpoint Save Strategy:** `steps`
- **Save Steps:** `500`
- **Load Best Model at End:** `True`
- **Metric for Best Model:** `eval_loss`
- **Direction:** Lower `eval_loss` is better (`greater_is_better = False`).
- **Learning Rate:** `2e-5`
- **Weight Decay:** `0.01`
- **Per-Device Batch Size (Training):** `32`
- **Per-Device Batch Size (Evaluation):** `32`
- **Warmup Steps:** `1000`
- **Mixed Precision Training:** Enabled (`fp16 = True`)
- **Logging Steps:** `100`
- **Gradient Accumulation Steps:** `2`

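
Expressed as a `transformers.TrainingArguments` object, these settings would look roughly as follows; the `output_dir` name is an assumption, and the original script may spell some arguments differently:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-finetuned-imdb-mlm",  # assumed directory name
    overwrite_output_dir=True,
    num_train_epochs=10,
    evaluation_strategy="steps",   # called `eval_strategy` in newer transformers releases
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    warmup_steps=1000,
    fp16=True,
    logging_steps=100,
    gradient_accumulation_steps=2,
)
```
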
### Early Stopping

- The model was trained with **early stopping** to prevent overfitting; a sketch of the callback wiring follows this list.
- Training stopped after **5.87 epochs** (21,000 steps), as `eval_loss` showed no significant further improvement.

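
Early stopping with the `Trainer` API is typically wired up through `EarlyStoppingCallback`. The sketch below shows one plausible setup; the patience value, the tokenized dataset names, and the masking probability are assumptions, not values reported above:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    Trainer,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")

# Dynamic masking of 15% of tokens (the collator's default probability).
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=training_args,             # the TrainingArguments sketched above
    train_dataset=tokenized_train,  # placeholder: tokenized train + unsupervised reviews
    eval_dataset=tokenized_eval,    # placeholder: tokenized held-out reviews
    data_collator=data_collator,
    # Assumed patience: stop if eval_loss fails to improve for 3 consecutive evaluations.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```
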
## Evaluation Results

- **Metric Used:** `eval_loss`
- **Final Perplexity:** `8.34` (perplexity is `exp(eval_loss)`; see the sketch below)
- **Best Checkpoint:** Model saved at the end of early stopping (step `21,000`).

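
Perplexity is the exponential of the evaluation cross-entropy loss, so the reported value corresponds to a computation like the following (assuming the `trainer` object from the sketch above):

```python
import math

# Evaluate the best checkpoint and convert the cross-entropy loss to perplexity.
eval_metrics = trainer.evaluate()
perplexity = math.exp(eval_metrics["eval_loss"])
print(f"Perplexity: {perplexity:.2f}")
```
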
## Model Usage

The model can be used for masked language modeling tasks using the `fill-mask` pipeline from Hugging Face. Example:

```python
from transformers import pipeline

mask_filler = pipeline("fill-mask", model="Prikshit7766/distilbert-finetuned-imdb-mlm")

text = "This is a great [MASK]."
predictions = mask_filler(text)

for pred in predictions:
    print(f">>> {pred['sequence']}")
```

**Output Example:**

```text
>>> This is a great movie.
>>> This is a great film.
>>> This is a great show.
>>> This is a great documentary.
>>> This is a great story.
```
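
For more control over candidate scoring, the same prediction can also be made without the pipeline; this is a generic Transformers sketch rather than code from the model author:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "Prikshit7766/distilbert-finetuned-imdb-mlm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

inputs = tokenizer("This is a great [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the 5 highest-scoring vocabulary tokens.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_tokens = logits[0, mask_index].topk(5).indices[0]

for token_id in top_tokens:
    print(tokenizer.decode(token_id))
```
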