Add Metadata

README.md (changed)
---
license: apache-2.0
datasets:
- indonlp/NusaX-senti
metrics:
- macro-f1
base_model:
- LazarusNLP/NusaBERT-large
pipeline_tag: text-classification
language:
- ace
---
# BERT + BiLSTM Model for Sequence Classification

## Overview

This repository contains a BERT-based model enhanced with a BiLSTM layer for sequence classification tasks. It combines the language understanding of a pre-trained BERT encoder with a BiLSTM's ability to capture sequential dependencies, making it well suited to sequence-level tasks such as sentiment analysis and text classification.

## Features:

- **Pre-trained BERT model**: Leverage BERT's embeddings for robust language understanding.
- **BiLSTM layer**: Capture sequential dependencies in both directions (forward and backward).
- **Customizable freezing of BERT layers**: Choose how many BERT layers to freeze, and whether to freeze from the start or the end.
- **Inference without labels**: Get logits directly for inference in production, with no need for labels.
- **Logging for better debugging**: Includes logging for important events such as model initialization, layer freezing, and inference.

## Installation:

1. Install the necessary dependencies:
```bash
pip install transformers torch
```

2. Clone this repository and navigate to the project folder:
```bash
git clone <repository-url>
cd <project-folder>
```

## Configuration:

The model's behavior can be customized using the following configuration options:

- **`freeze_bert`**: If `True`, the BERT model's layers will be frozen according to the specified settings.
- **`freeze_n_layers`**: An integer that defines the number of layers to freeze.
- **`freeze_from_start`**: If `True`, freeze the first `n` layers from the start; if `False`, freeze the last `n` layers from the end.
- **`concat_layers`**: Number of BERT layers to concatenate for the final sequence output.
- **`pooling`**: Type of pooling to apply. Options: `'last'`, `'mean'`, etc.

Example usage for configuring the model:

```python
from transformers import BertTokenizer
from modeling_bert_bilstm import BertBiLSTMForSequenceClassification, BertBiLSTMConfig

# Configure the model
config = BertBiLSTMConfig(
    bert_model_name="bert-base-uncased",
    freeze_bert=True,
    freeze_n_layers=10,
    freeze_from_start=False  # Freeze the last 10 layers
)

# Initialize the model and the matching tokenizer
model = BertBiLSTMForSequenceClassification(config)
tokenizer = BertTokenizer.from_pretrained(config.bert_model_name)

# Print the model's freeze summary
freeze_summary = model.get_freeze_summary()
print(freeze_summary)
```
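
The remaining options can be passed the same way. A minimal sketch for `concat_layers` and `pooling` (the values here are illustrative, not recommendations):

```python
# Illustrative values only: concatenate the last 4 BERT layers and
# mean-pool the sequence output before classification.
config = BertBiLSTMConfig(
    bert_model_name="bert-base-uncased",
    concat_layers=4,
    pooling="mean"
)
```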

## Training the Model:

To train the model, prepare your dataset and use a standard PyTorch training loop. Here's an outline of how you might train the model:

```python
import torch
from torch.optim import AdamW  # transformers' AdamW is deprecated; use the PyTorch one
from torch.utils.data import DataLoader

# Create the DataLoader, optimizer, etc. (train_dataset is assumed to be prepared)
train_dataloader = DataLoader(train_dataset, batch_size=32, shuffle=True)
optimizer = AdamW(model.parameters(), lr=1e-5)
num_epochs = 3  # adjust for your task

for epoch in range(num_epochs):
    model.train()
    for batch in train_dataloader:
        input_ids = batch["input_ids"]
        attention_mask = batch["attention_mask"]
        labels = batch["labels"]

        optimizer.zero_grad()
        output = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
        loss = output["loss"]
        loss.backward()
        optimizer.step()
```
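
Since the model card lists macro-F1 in its metadata, here is a minimal evaluation sketch using scikit-learn. The `val_dataloader` is an assumption: a validation DataLoader built the same way as `train_dataloader` above.

```python
import torch
from sklearn.metrics import f1_score

model.eval()
all_preds, all_labels = [], []
with torch.no_grad():
    for batch in val_dataloader:  # assumed: a validation DataLoader
        logits = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
        all_preds.extend(torch.argmax(logits, dim=-1).tolist())
        all_labels.extend(batch["labels"].tolist())

# Macro-F1 averages per-class F1 scores, weighting each class equally
print("Macro-F1:", f1_score(all_labels, all_preds, average="macro"))
```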

## Inference (Prediction without Labels):

When serving the model in production, it can run inference without labels; the forward pass returns logits directly.

### Example Forward Pass for Inference:

```python
import torch

# Example input (input_ids, attention_mask)
input_ids = torch.tensor([[101, 2054, 2003, 102]])  # Example tokenized input
attention_mask = torch.tensor([[1, 1, 1, 1]])       # Example attention mask

# Get logits for prediction (no labels required)
model.eval()
with torch.no_grad():
    logits = model(input_ids=input_ids, attention_mask=attention_mask)
print(logits)
```
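
In practice, `input_ids` and `attention_mask` usually come from the tokenizer rather than hand-built tensors. A short sketch using the `tokenizer` initialized in the configuration example above:

```python
text = "What a great movie!"
encoded = tokenizer(text, return_tensors="pt", truncation=True)

model.eval()
with torch.no_grad():
    logits = model(input_ids=encoded["input_ids"], attention_mask=encoded["attention_mask"])

# The highest-scoring class index is the prediction
predicted_class = logits.argmax(dim=-1).item()
print(predicted_class)
```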

### Logging:

This model includes logging to help with debugging and monitoring during training and inference. Logs include information such as:
- Initialization of the BERT model.
- Freezing of layers.
- Inference start and completion.

To configure logging:

```python
import logging

# Set up logging
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s - %(levelname)s - %(message)s',
                    handlers=[logging.StreamHandler()])

logger = logging.getLogger(__name__)

# Example log messages (lazy %-formatting defers string building until needed)
logger.info("Model initialized with BERT model: %s", config.bert_model_name)
logger.info("Freezing the last %d layers of BERT.", config.freeze_n_layers)
```

## Model Freezing Configuration:

You can customize which layers of BERT to freeze. The `freeze_n_layers` parameter allows you to freeze a specific number of layers either from the start or the end of the BERT model:

- **`freeze_from_start=True`**: Freeze the first `n` layers.
- **`freeze_from_start=False`**: Freeze the last `n` layers.

### Example of Freezing Layers:

```python
config = BertBiLSTMConfig(
    freeze_bert=True,
    freeze_n_layers=10,      # Freeze 10 layers
    freeze_from_start=False  # Freeze from the end, i.e. the last 10 layers
)
```

## Model Summary:

You can view a summary of which layers are frozen and which are trainable by using the `get_freeze_summary()` method:

```python
freeze_summary = model.get_freeze_summary()
print(freeze_summary)
```

Example output (for the configuration above, which freezes the last 10 of BERT-base's 12 layers):

```python
[
    {"layer": "bert.encoder.layer.0", "trainable": True},
    {"layer": "bert.encoder.layer.1", "trainable": True},
    {"layer": "bert.encoder.layer.2", "trainable": False},
    {"layer": "bert.encoder.layer.3", "trainable": False},
    ...
]
```

## Notes:

- The model can be served for real-time predictions via APIs built with frameworks such as **FastAPI** or **Flask**; a minimal serving sketch follows this list.
- Make sure to handle logging and exception management properly in production.
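
A minimal FastAPI serving sketch, assuming FastAPI and Uvicorn are installed and that `model` and `tokenizer` are the objects from the examples above (the endpoint path and request schema are illustrative):

```python
import torch
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    text: str

@app.post("/predict")
def predict(req: PredictRequest):
    # Tokenize the raw text and run a forward pass without labels
    encoded = tokenizer(req.text, return_tensors="pt", truncation=True)
    model.eval()
    with torch.no_grad():
        logits = model(input_ids=encoded["input_ids"],
                       attention_mask=encoded["attention_mask"])
    return {"predicted_class": int(logits.argmax(dim=-1))}

# Run with: uvicorn app:app --reload  (assuming this file is app.py)
```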

## License:

This repository is licensed under the Apache License 2.0. See the LICENSE file for more information.