Commit 0ed955f (verified) by Prikshit7766 · parent 2161ada

Update README.md

Files changed (1): README.md (+88 -77)

README.md (updated version below):
---
datasets:
- stanfordnlp/imdb
language:
- en
metrics:
- perplexity
base_model:
- distilbert/distilbert-base-uncased
pipeline_tag: fill-mask
library_name: transformers
---

# DistilBERT Fine-Tuned on IMDB for Masked Language Modeling

## Model Description

This model is a fine-tuned version of [**`distilbert-base-uncased`**](https://huggingface.co/distilbert/distilbert-base-uncased) for masked language modeling. It was trained on the IMDb movie-review dataset.

## Model Training Details

### Training Dataset

- **Dataset:** [IMDB dataset](https://huggingface.co/datasets/stanfordnlp/imdb) from Hugging Face
- **Dataset Splits:**
  - Train: 25,000 samples
  - Test: 25,000 samples
  - Unsupervised: 50,000 samples
- **Split Concatenation:** Training was performed on the concatenation of the `train` and `unsupervised` splits (75,000 reviews in total); a sketch of this step follows the list.

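
As a minimal sketch (not the original training script), the combined corpus could be built with the `datasets` library like this; the variable names are illustrative:

```python
from datasets import load_dataset, concatenate_datasets

# Load the three IMDb splits from the Hugging Face Hub.
imdb = load_dataset("stanfordnlp/imdb")

# Combine the labeled train split with the unlabeled reviews;
# sentiment labels are irrelevant for masked language modeling.
mlm_corpus = concatenate_datasets([imdb["train"], imdb["unsupervised"]])

print(len(mlm_corpus))  # 75,000 reviews
```
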
### Training Arguments

The following parameters were used during fine-tuning (mirrored in the sketch after the list):

- **Number of Training Epochs:** `10`
- **Overwrite Output Directory:** `True`
- **Evaluation Strategy:** `steps`
- **Evaluation Steps:** `500`
- **Checkpoint Save Strategy:** `steps`
- **Save Steps:** `500`
- **Load Best Model at End:** `True`
- **Metric for Best Model:** `eval_loss`
- **Direction:** Lower `eval_loss` is better (`greater_is_better = False`).
- **Learning Rate:** `2e-5`
- **Weight Decay:** `0.01`
- **Per-Device Batch Size (Training):** `32`
- **Per-Device Batch Size (Evaluation):** `32`
- **Warmup Steps:** `1000`
- **Mixed Precision Training:** Enabled (`fp16 = True`)
- **Logging Steps:** `100`
- **Gradient Accumulation Steps:** `2`

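
Expressed as a `transformers.TrainingArguments` object, these settings would look roughly as follows; the `output_dir` name is an assumption, and the original script may spell some arguments differently:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distilbert-finetuned-imdb-mlm",  # assumed directory name
    overwrite_output_dir=True,
    num_train_epochs=10,
    evaluation_strategy="steps",   # called `eval_strategy` in newer transformers releases
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    warmup_steps=1000,
    fp16=True,
    logging_steps=100,
    gradient_accumulation_steps=2,
)
```
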
### Early Stopping

- The model was trained with **early stopping** to prevent overfitting; a sketch of the callback wiring follows this list.
- Training stopped after **5.87 epochs** (21,000 steps), as `eval_loss` showed no significant further improvement.

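
Early stopping with the `Trainer` API is typically wired up through `EarlyStoppingCallback`. The sketch below shows one plausible setup; the patience value, the tokenized dataset names, and the masking probability are assumptions, not values reported above:

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    EarlyStoppingCallback,
    Trainer,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("distilbert-base-uncased")

# Dynamic masking of 15% of tokens (the collator's default probability).
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=training_args,             # the TrainingArguments sketched above
    train_dataset=tokenized_train,  # placeholder: tokenized train + unsupervised reviews
    eval_dataset=tokenized_eval,    # placeholder: tokenized held-out reviews
    data_collator=data_collator,
    # Assumed patience: stop if eval_loss fails to improve for 3 consecutive evaluations.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```
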
## Evaluation Results

- **Metric Used:** `eval_loss`
- **Final Perplexity:** `8.34` (perplexity is `exp(eval_loss)`; see the sketch below)
- **Best Checkpoint:** Model saved at the end of early stopping (step `21,000`).

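
Perplexity is the exponential of the evaluation cross-entropy loss, so the reported value corresponds to a computation like the following (assuming the `trainer` object from the sketch above):

```python
import math

# Evaluate the best checkpoint and convert the cross-entropy loss to perplexity.
eval_metrics = trainer.evaluate()
perplexity = math.exp(eval_metrics["eval_loss"])
print(f"Perplexity: {perplexity:.2f}")
```
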
## Model Usage

The model can be used for masked language modeling tasks using the `fill-mask` pipeline from Hugging Face. Example:

```python
from transformers import pipeline

mask_filler = pipeline("fill-mask", model="Prikshit7766/distilbert-finetuned-imdb-mlm")

text = "This is a great [MASK]."
predictions = mask_filler(text)

for pred in predictions:
    print(f">>> {pred['sequence']}")
```

**Output Example:**

```text
>>> This is a great movie.
>>> This is a great film.
>>> This is a great show.
>>> This is a great documentary.
>>> This is a great story.
```
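
For more control over candidate scoring, the same prediction can also be made without the pipeline; this is a generic Transformers sketch rather than code from the model author:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "Prikshit7766/distilbert-finetuned-imdb-mlm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

inputs = tokenizer("This is a great [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the 5 highest-scoring vocabulary tokens.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_tokens = logits[0, mask_index].topk(5).indices[0]

for token_id in top_tokens:
    print(tokenizer.decode(token_id))
```
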