AventIQ-AI
/

Bert-Disaster-SOS-Message-Classifier

Safetensors

bert


developerPushkal committed on Mar 19

Commit

f5381b5


1 Parent(s): 871a44f

Create README.md


Files changed (1)

README.md +104 -0

README.md ADDED

	@@ -0,0 +1,104 @@

+# **BERT-Base-Uncased Quantized Model for Disaster SOS Message Classification**
+This repository hosts a float16-quantized version of BERT Base Uncased, fine-tuned for **Disaster SOS Message Classification**. The model classifies emergency messages related to disasters so that urgent cases can be prioritized. It is optimized for deployment in resource-constrained environments and reports an accuracy of 0.85 and an F1 score of 0.83.
+## **Model Details**
+- **Model Architecture:** BERT Base Uncased
+- **Task:** Disaster SOS Message Classification
+- **Dataset:** Disaster Response Messages Dataset
+- **Quantization:** Float16
+- **Fine-tuning Framework:** Hugging Face Transformers
+## **Usage**
+### **Installation**
+```sh
+pip install transformers torch numpy
+```
+### **Loading the Model**
+```python
+from transformers import BertForSequenceClassification, BertTokenizer
+import torch
+# Path to the fine-tuned FP16 checkpoint (replace with your own local directory)
+quantized_model_path = "/kaggle/working/bert_finetuned_fp16"
+# FP16 is intended for GPU inference; on CPU, fall back to FP32 because
+# half-precision kernels may be slow or unavailable
+device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+quantized_model = BertForSequenceClassification.from_pretrained(quantized_model_path)
+if device.type == "cuda":
+    quantized_model.half()  # Convert model to FP16
+else:
+    quantized_model.float()  # Keep FP32 on CPU
+quantized_model.to(device)
+quantized_model.eval()  # Set to evaluation mode
+# Load tokenizer
+tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
+# Define a test SOS message
+test_message = "There is a massive earthquake, and people need help immediately!"
+# Tokenize input (input_ids and attention_mask are already int64 tensors)
+inputs = tokenizer(test_message, return_tensors="pt", padding=True, truncation=True, max_length=128)
+# Move input tensors to the same device as the model
+inputs = {name: tensor.to(device) for name, tensor in inputs.items()}
+# Make prediction
+with torch.no_grad():
+    outputs = quantized_model(**inputs)
+# Multi-label decoding: independent sigmoid per category, thresholded at 0.5
+probs = torch.sigmoid(outputs.logits.float()).cpu().numpy().flatten()
+predictions = (probs > 0.5).astype(int)
+# Category mapping (example only; it must match the label order used in training)
+category_names = ["Earthquake", "Flood", "Medical Emergency", "Infrastructure Damage", "General Help"]
+predicted_labels = [name for name, flag in zip(category_names, predictions) if flag == 1]
+print(f"Message: {test_message}")
+print(f"Predicted Categories: {predicted_labels}")
+print(f"Confidence Scores: {probs}")
+```
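The thresholding step in the example above can be isolated into a small helper. The following is a minimal sketch in plain Python, assuming a multi-label head where each logit is scored independently with a sigmoid. `decode_predictions` is a hypothetical name, and the category list is the illustrative one from the example, not a confirmed label set.

```python
import math

# Illustrative category list copied from the usage example; the real label
# order depends on how the model was trained.
CATEGORY_NAMES = [
    "Earthquake",
    "Flood",
    "Medical Emergency",
    "Infrastructure Damage",
    "General Help",
]


def decode_predictions(logits, category_names=CATEGORY_NAMES, threshold=0.5):
    """Return (label, probability) pairs whose sigmoid score exceeds the threshold."""
    results = []
    for name, logit in zip(category_names, logits):
        prob = 1.0 / (1.0 + math.exp(-logit))  # sigmoid of a single logit
        if prob > threshold:
            results.append((name, prob))
    return results


# Hypothetical logits for one message, one value per category
example_logits = [2.0, -3.0, 0.1, -1.0, 0.0]
for label, prob in decode_predictions(example_logits):
    print(f"{label}: {prob:.2f}")
```

Raising `threshold` trades recall for precision, which may matter when false alarms are costly.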
+## **Performance Metrics**
+- **Accuracy:** 0.85
+- **F1 Score:** 0.83
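The README does not state how these metrics were computed. As one possible reading, the sketch below computes per-label accuracy and micro-averaged F1 for multi-label predictions in plain Python. The function names and the toy data are assumptions made for illustration, not the author's evaluation code.

```python
def label_accuracy(y_true, y_pred):
    """Fraction of individual label decisions that match the ground truth."""
    total = 0
    correct = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            correct += int(t == p)
    return correct / total if total else 0.0


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all labels and samples."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: two messages, three hypothetical categories
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
print(f"Accuracy: {label_accuracy(y_true, y_pred):.2f}")
print(f"Micro F1: {micro_f1(y_true, y_pred):.2f}")
```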
+## **Fine-Tuning Details**
+### **Dataset**
+The dataset is the **Disaster Response Messages Dataset**, which contains real-life messages from various disaster scenarios.
+### **Training**
+- Number of epochs: 3
+- Batch size: 8
+- Evaluation strategy: epoch
+- Learning rate: 2e-5
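The hyperparameters above could be collected into keyword arguments for Hugging Face's `TrainingArguments`. This is a hedged sketch only: the output directory is a hypothetical placeholder, treating the batch size as a per-device training batch size is an assumption, and the evaluation keyword is named `eval_strategy` in recent Transformers releases but `evaluation_strategy` in older ones.

```python
def build_training_kwargs(eval_key="eval_strategy"):
    """Collect the listed hyperparameters as keyword arguments.

    eval_key: name of the evaluation-schedule argument; pass
    "evaluation_strategy" for older Transformers releases.
    """
    return {
        "output_dir": "./results",         # hypothetical path, not from the README
        "num_train_epochs": 3,             # Number of epochs: 3
        "per_device_train_batch_size": 8,  # Batch size: 8 (assumed per device)
        eval_key: "epoch",                 # Evaluation strategy: epoch
        "learning_rate": 2e-5,             # Learning rate: 2e-5
    }


training_kwargs = build_training_kwargs()
print(training_kwargs)
# With Transformers installed, these could be passed on as:
#   TrainingArguments(**training_kwargs)
```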
+### **Quantization**
+After fine-tuning, the model weights were converted to float16 (half precision) with PyTorch. This roughly halves the model size relative to float32 and can improve inference speed on hardware with FP16 support.
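To give a rough sense of the size reduction, the arithmetic below compares float32 and float16 storage. It assumes float16 weights, as listed under Model Details, and uses an approximate figure of 110 million parameters for BERT Base; the exact count for this checkpoint is not stated in the README.

```python
# Bytes needed to store one parameter in each dtype
BYTES_PER_PARAM = {"float32": 4, "float16": 2}

# Assumption: approximate parameter count of BERT Base, not measured on this model
APPROX_BERT_BASE_PARAMS = 110_000_000


def estimate_size_mb(num_params, dtype):
    """Estimate weight storage in megabytes (10**6 bytes) for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1_000_000


for dtype in ("float32", "float16"):
    size = estimate_size_mb(APPROX_BERT_BASE_PARAMS, dtype)
    print(f"{dtype}: ~{size:.0f} MB")
```

The estimate covers weights only; activations and tokenizer files add to the real footprint.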
+## **Repository Structure**
+```
+.
+├── model/               # Contains the quantized model files
+├── tokenizer_config/    # Tokenizer configuration and vocabulary files
+├── model.safetensors    # Fine-tuned model weights
+└── README.md            # Model documentation
+```
+## **Limitations**
+- The model may not generalize well to unseen disaster types outside the training data.
+- Quantization to float16 may cause minor accuracy degradation compared with the full-precision model.
+## **Contributing**
+Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.
+---