Commit 4eac93b (verified) by Amal17 · Parent: a4942a2

Add Metadata

Files changed (1): README.md (+182 -171)
---
license: apache-2.0
datasets:
- indonlp/NusaX-senti
metrics:
- macro-f1
base_model:
- LazarusNLP/NusaBERT-large
pipeline_tag: text-classification
language:
- ace
---

# BERT + BiLSTM Model for Sequence Classification

## Overview

This repository contains a BERT-based model enhanced with a BiLSTM layer for sequence classification. It pairs the language understanding of a pre-trained BERT encoder with a BiLSTM that models sequential dependencies, making it suitable for sequence-level tasks such as sentiment analysis and text classification.

## Features:
- **Pre-trained BERT model**: Leverage BERT's embeddings for robust language understanding.
- **BiLSTM layer**: Capture sequential dependencies in both directions (forward and backward).
- **Customizable freezing of BERT layers**: Choose how many BERT layers to freeze, and whether to freeze from the start or the end of the encoder.
- **Inference without labels**: Get logits directly for inference in production, with no need for labels.
- **Logging for better debugging**: Includes logging for important events such as model initialization, layer freezing, and inference.

## Installation:

1. Install the necessary dependencies:

   ```bash
   pip install transformers torch
   ```

2. Clone this repository and navigate to the project folder:

   ```bash
   git clone <repository-url>
   cd <project-folder>
   ```

## Configuration:

The model's behavior can be customized using the following configuration options:

- **`freeze_bert`**: If `True`, the BERT model's layers will be frozen according to the settings below.
- **`freeze_n_layers`**: An integer that defines the number of layers to freeze.
- **`freeze_from_start`**: If `True`, freeze the first `n` layers from the start; if `False`, freeze the last `n` layers from the end.
- **`concat_layers`**: Number of BERT layers to concatenate for the final sequence output.
- **`pooling`**: Type of pooling to apply over the sequence output. Options include `'last'` and `'mean'` (a sketch of how mean pooling might work appears after the configuration example below).

Example usage for configuring the model:

```python
from transformers import BertTokenizer
from modeling_bert_bilstm import BertBiLSTMForSequenceClassification, BertBiLSTMConfig

# Configure the model
config = BertBiLSTMConfig(
    bert_model_name="bert-base-uncased",
    freeze_bert=True,
    freeze_n_layers=10,
    freeze_from_start=False  # Freeze the last 10 layers
)

# Load the tokenizer that matches the underlying BERT model
tokenizer = BertTokenizer.from_pretrained(config.bert_model_name)

# Initialize the model
model = BertBiLSTMForSequenceClassification(config)

# Print the model's freeze summary
freeze_summary = model.get_freeze_summary()
print(freeze_summary)
```

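The `pooling` option decides how the per-token states are reduced to a single sequence vector before classification. The snippet below is a minimal, self-contained sketch of what masked `'mean'` and `'last'` pooling typically look like; the tensor names and shapes are illustrative assumptions, not this repository's internal API.

```python
import torch

# Illustrative shapes: (batch, seq_len, hidden) per-token states from the BiLSTM,
# and a (batch, seq_len) attention mask with 1 for real tokens, 0 for padding.
hidden_states = torch.randn(2, 6, 512)
attention_mask = torch.tensor([[1, 1, 1, 1, 0, 0],
                               [1, 1, 1, 0, 0, 0]])

# 'mean' pooling: average only over non-padding positions.
mask = attention_mask.unsqueeze(-1).float()   # (batch, seq_len, 1)
summed = (hidden_states * mask).sum(dim=1)    # (batch, hidden)
counts = mask.sum(dim=1).clamp(min=1e-9)      # avoid division by zero
mean_pooled = summed / counts                 # (batch, hidden)

# 'last' pooling: take the state at the last non-padding position of each sequence.
last_idx = attention_mask.sum(dim=1) - 1
last_pooled = hidden_states[torch.arange(hidden_states.size(0)), last_idx]

print(mean_pooled.shape, last_pooled.shape)   # torch.Size([2, 512]) torch.Size([2, 512])
```
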
## Training the Model:

To train the model, prepare your dataset and use a standard PyTorch training loop. The outline below assumes `train_dataset` yields `input_ids`, `attention_mask`, and `labels`; a sketch of one way to build such a dataset follows the loop.

```python
from torch.optim import AdamW
from torch.utils.data import DataLoader

# Create DataLoader, optimizer, etc. (train_dataset prepared as described above)
train_dataloader = DataLoader(train_dataset, batch_size=32, shuffle=True)
optimizer = AdamW(model.parameters(), lr=1e-5)

num_epochs = 3  # example value

for epoch in range(num_epochs):
    model.train()
    for batch in train_dataloader:
        input_ids = batch["input_ids"]
        attention_mask = batch["attention_mask"]
        labels = batch["labels"]

        optimizer.zero_grad()
        output = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
        loss = output["loss"]
        loss.backward()
        optimizer.step()
```

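As a starting point for the dataset preparation step, here is a hedged sketch that tokenizes text/label pairs into the tensors the loop above expects. The `load_dataset("indonlp/NusaX-senti", "ace")` call and its `text`/`label` column names are assumptions based on this model card's metadata; adjust them to your actual data.

```python
import torch
from datasets import load_dataset  # pip install datasets
from torch.utils.data import Dataset

class SentimentDataset(Dataset):
    """Wraps tokenized text/label pairs for the training loop above."""

    def __init__(self, texts, labels, tokenizer, max_length=128):
        self.encodings = tokenizer(
            list(texts), truncation=True, padding="max_length",
            max_length=max_length, return_tensors="pt"
        )
        self.labels = torch.tensor(list(labels))

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return {
            "input_ids": self.encodings["input_ids"][idx],
            "attention_mask": self.encodings["attention_mask"][idx],
            "labels": self.labels[idx],
        }

# Assumed dataset id, config, and column names -- verify against the dataset card.
raw = load_dataset("indonlp/NusaX-senti", "ace", split="train")
train_dataset = SentimentDataset(raw["text"], raw["label"], tokenizer)
```
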
## Inference (Prediction without Labels):

When serving the model in production, it can run inference without labels.

### Example Forward Pass for Inference:

```python
import torch

# Example input (input_ids, attention_mask)
input_ids = torch.tensor([[101, 2054, 2003, 102]])  # Example tokenized input
attention_mask = torch.tensor([[1, 1, 1, 1]])       # Example attention mask

# Get logits for prediction (no labels required)
model.eval()
with torch.no_grad():
    logits = model(input_ids=input_ids, attention_mask=attention_mask)
print(logits)
```

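In practice you would tokenize raw text rather than hand-write token IDs. The sketch below uses the tokenizer loaded in the configuration example and turns the logits into a predicted class index; it assumes the forward pass returns a logits tensor of shape `(batch, num_labels)`, as in the example above.

```python
import torch

text = "This is an example sentence."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)

model.eval()
with torch.no_grad():
    logits = model(input_ids=inputs["input_ids"],
                   attention_mask=inputs["attention_mask"])

probs = torch.softmax(logits, dim=-1)        # class probabilities
predicted_class = int(probs.argmax(dim=-1))  # index of the most likely label
print(predicted_class, probs.tolist())
```
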
### Logging:

This model includes logging to help with debugging and monitoring during training and inference. Logs include information such as:
- Initialization of the BERT model.
- Freezing layers.
- Inference start and completion.

To configure logging:

```python
import logging

# Set up logging
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s - %(levelname)s - %(message)s',
                    handlers=[logging.StreamHandler()])

logger = logging.getLogger(__name__)

# Example log messages
logger.info("Model initialized with BERT model: %s", config.bert_model_name)
logger.info("Freezing the top %d layers of BERT.", config.freeze_n_layers)
```

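For production deployments (see the notes at the end of this README), you may also want logs on disk rather than only on stdout. The following is a minimal sketch using only the Python standard library; the file name and size limits are arbitrary example values.

```python
import logging
from logging.handlers import RotatingFileHandler

logger = logging.getLogger("bert_bilstm")
logger.setLevel(logging.INFO)

formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')

# Console output
console = logging.StreamHandler()
console.setFormatter(formatter)
logger.addHandler(console)

# Rotating log file: keep up to 3 files of ~5 MB each (example values)
file_handler = RotatingFileHandler("inference.log", maxBytes=5_000_000, backupCount=3)
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)

logger.info("Logging configured for production serving.")
```
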
## Model Freezing Configuration:

You can customize which layers of BERT to freeze. The `freeze_n_layers` parameter freezes a specific number of layers, counted either from the start or from the end of the BERT encoder. A sketch of how this kind of freezing is typically implemented follows the example below.

- **`freeze_from_start=True`**: Freeze the first `n` layers.
- **`freeze_from_start=False`**: Freeze the last `n` layers.

### Example of Freezing Layers:

```python
config = BertBiLSTMConfig(
    freeze_bert=True,
    freeze_n_layers=10,       # Freeze 10 layers
    freeze_from_start=False   # Freeze from the end (the last 10 layers)
)
```

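For readers curious what freezing means at the parameter level, here is a minimal sketch of the usual approach with a Hugging Face `BertModel`: disabling gradients on the chosen encoder layers. This illustrates the general technique and is an assumption about how this repository implements it, not a copy of its internal code.

```python
from transformers import BertModel

def freeze_bert_layers(bert: BertModel, n_layers: int, from_start: bool = True):
    """Set requires_grad=False on n encoder layers, counted from the start or the end."""
    layers = bert.encoder.layer
    selected = layers[:n_layers] if from_start else layers[-n_layers:]
    for layer in selected:
        for param in layer.parameters():
            param.requires_grad = False

bert = BertModel.from_pretrained("bert-base-uncased")
freeze_bert_layers(bert, n_layers=10, from_start=False)  # freeze the last 10 layers

trainable = sum(p.numel() for p in bert.parameters() if p.requires_grad)
print(f"Trainable parameters after freezing: {trainable:,}")
```
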
## Model Summary:

You can view a summary of which layers are frozen and which are trainable by using the `get_freeze_summary()` method:

```python
freeze_summary = model.get_freeze_summary()
print(freeze_summary)
```

Example output:

```python
[
    {"layer": "bert.encoder.layer.0", "trainable": False},
    {"layer": "bert.encoder.layer.1", "trainable": False},
    {"layer": "bert.encoder.layer.2", "trainable": True},
    {"layer": "bert.encoder.layer.3", "trainable": True},
    ...
]
```

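If the summary is a list of dicts like the example above, it is easy to turn into a quick sanity check before training; the snippet below assumes exactly that structure.

```python
from collections import Counter

# Assumes get_freeze_summary() returns [{"layer": ..., "trainable": bool}, ...]
counts = Counter(entry["trainable"] for entry in freeze_summary)
print(f"Trainable layers: {counts[True]}, frozen layers: {counts[False]}")
```
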
## Notes:
- The model can be served in production for real-time predictions behind a web framework such as **FastAPI** or **Flask**; a minimal FastAPI sketch is shown below.
- Make sure to handle logging and exception management properly in production.

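As a starting point for such a service, here is a hedged FastAPI sketch. The endpoint name, request schema, and the assumption that the forward pass returns raw logits are all illustrative; adapt them to your deployment.

```python
# pip install fastapi uvicorn
import torch
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    text: str

@app.post("/predict")
def predict(request: PredictRequest):
    # tokenizer and model are assumed to be loaded at startup, as in the examples above
    inputs = tokenizer(request.text, return_tensors="pt", truncation=True, padding=True)
    model.eval()
    with torch.no_grad():
        logits = model(input_ids=inputs["input_ids"],
                       attention_mask=inputs["attention_mask"])
    predicted_class = int(logits.argmax(dim=-1))
    return {"predicted_class": predicted_class}

# Run with: uvicorn app:app --host 0.0.0.0 --port 8000  (assuming this file is app.py)
```
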
## License:
This repository is licensed under the MIT License. See the LICENSE file for more information.