# Model Card: Fine-Tuned MobileNetV2 for Skin Lesion Classification

# Model Details

**Model Name:** Fine-Tuned MobileNetV2 for Skin Lesion Classification
**Base Model:** google/mobilenet_v2_1.0_224 (pretrained on ImageNet)
**Dataset:** marmal88/skin_cancer
**Quantization:** Available as an optional FP16 version for optimized inference
**Training Device:** CUDA (GPU, 12 GB)

# Dataset Information

Dataset structure:

```
DatasetDict({
    train: Dataset({
        features: ['image', 'image_id', 'lesion_id', 'dx', 'dx_type', 'age', 'sex', 'localization'],
        num_rows: 9577
    })
    validation: Dataset({
        features: ['image', 'image_id', 'lesion_id', 'dx', 'dx_type', 'age', 'sex', 'localization'],
        num_rows: 2492
    })
    test: Dataset({
        features: ['image', 'image_id', 'lesion_id', 'dx', 'dx_type', 'age', 'sex', 'localization'],
        num_rows: 1285
    })
})
```

Available splits:

- Train: 9,577 examples
- Validation: 2,492 examples
- Test: 1,285 examples

# Feature Representation

- image: RGB image (originally 600x450, resized to 224x224 during preprocessing)
- image_id: Unique identifier (e.g., ISIC_0024329)
- lesion_id: Lesion identifier (e.g., HAM_0002954)
- dx: Diagnosis label (e.g., melanoma, actinic_keratoses)
- dx_type: Diagnosis method (e.g., histo for histopathology)
- age: Patient age (float, e.g., 75.0)
- sex: Patient sex (e.g., female)
- localization: Body location (e.g., lower extremity)

**Note:** Only image and dx (converted to integer labels) were used for training; the other features were dropped during preprocessing (see the loading sketch below).

# Training Details

- Number of Classes: 7
- Class Names: actinic_keratoses, basal_cell_carcinoma, benign_keratosis-like_lesions, dermatofibroma, melanocytic_Nevi, melanoma, vascular_lesions
- Training Process: Fine-tuned for 5 epochs (initially planned for 10, reduced to 5); a training-loop sketch appears after the Performance Metrics section
- Learning Rate: 0.001 (Adam optimizer, fine-tuning all layers)
- Batch Size: 32 (suitable for a 12 GB GPU)
- Preprocessing: Images resized to 224x224 and normalized with mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225] (ImageNet statistics)

# Performance Metrics

- Epochs: 5
- Training Loss: [To be filled after training output]
- Validation Loss: [To be filled after training output]
- Accuracy: [To be filled after training output]
- F1 Score: Not computed (can be added with additional evaluation; see the evaluation sketch below)

**Note:** Performance metrics depend on your training output. Please provide the training log (loss/accuracy per epoch) to complete this section.
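The preprocessing script itself is not included in this card. The following is a minimal sketch, assuming the Hugging Face `datasets` library and an alphabetical string-to-integer encoding of `dx`, of how the dataset can be loaded and reduced to the image/label pair described above.

```python
from datasets import load_dataset

# Load the three splits described under "Dataset Information"
dataset = load_dataset("marmal88/skin_cancer")

# Map the 7 dx strings to integer labels (alphabetical order is an assumption;
# the original preprocessing may have used a different ordering)
class_names = sorted(set(dataset["train"]["dx"]))
label_mapping = {name: idx for idx, name in enumerate(class_names)}

def encode_label(example):
    example["label"] = label_mapping[example["dx"]]
    return example

# Keep only the image and the integer label; drop the metadata columns
metadata_columns = ["image_id", "lesion_id", "dx", "dx_type", "age", "sex", "localization"]
dataset = dataset.map(encode_label).remove_columns(metadata_columns)
```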
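The exact training script is likewise not reproduced here. Below is a minimal sketch consistent with the settings listed under Training Details (5 epochs, Adam at 0.001 over all layers, batch size 32, ImageNet normalization). Loading the base checkpoint with `MobileNetV2ForImageClassification` and swapping its ImageNet head for a 7-class head via `ignore_mismatched_sizes` is an assumption inferred from the named base model; `dataset` is the object prepared in the loading sketch above.

```python
import torch
from torch.optim import Adam
from torch.utils.data import DataLoader
from torchvision import transforms
from transformers import MobileNetV2ForImageClassification

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Replace the ImageNet classification head with a randomly initialized 7-class head
model = MobileNetV2ForImageClassification.from_pretrained(
    "google/mobilenet_v2_1.0_224",
    num_labels=7,
    ignore_mismatched_sizes=True,
).to(device)

# Same preprocessing as described under "Training Details"
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def collate(batch):
    pixel_values = torch.stack([preprocess(ex["image"].convert("RGB")) for ex in batch])
    labels = torch.tensor([ex["label"] for ex in batch])
    return pixel_values, labels

train_loader = DataLoader(dataset["train"], batch_size=32, shuffle=True, collate_fn=collate)

optimizer = Adam(model.parameters(), lr=0.001)  # all layers are fine-tuned
model.train()
for epoch in range(5):
    for pixel_values, labels in train_loader:
        pixel_values, labels = pixel_values.to(device), labels.to(device)
        outputs = model(pixel_values=pixel_values, labels=labels)  # cross-entropy loss computed internally
        optimizer.zero_grad()
        outputs.loss.backward()
        optimizer.step()
    print(f"Epoch {epoch + 1}/5, last batch loss: {outputs.loss.item():.4f}")
```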
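Since the F1 score is listed above as not computed, here is one way to add it: a sketch that evaluates accuracy and macro F1 on the test split with scikit-learn, reusing `model`, `device`, `collate`, and `dataset` from the sketches above.

```python
import torch
from sklearn.metrics import accuracy_score, f1_score
from torch.utils.data import DataLoader

model.eval()
all_preds, all_labels = [], []
test_loader = DataLoader(dataset["test"], batch_size=32, collate_fn=collate)
with torch.no_grad():
    for pixel_values, labels in test_loader:
        logits = model(pixel_values=pixel_values.to(device)).logits
        all_preds.extend(logits.argmax(dim=1).cpu().tolist())
        all_labels.extend(labels.tolist())

print(f"Accuracy: {accuracy_score(all_labels, all_preds):.4f}")
# Macro F1 weights all 7 classes equally, which matters for imbalanced lesion classes
print(f"Macro F1: {f1_score(all_labels, all_preds, average='macro'):.4f}")
```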
# Inference Example

```python
import json

import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms

# Preprocessing (matches training)
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

# Load model and labels
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_path = "skin_cancer_model_fp16/mobilenetv2_skin_cancer_fp16.pt"
model = torch.load(model_path, map_location=device)
model = model.to(device)
model.eval()

with open("skin_cancer_model_fp16/labels.json", 'r') as f:
    label_mapping = json.load(f)
class_names = list(label_mapping.keys())

# Inference function
def predict_image(image_path, model, preprocess, device, class_names):
    image = Image.open(image_path).convert('RGB')
    image_tensor = preprocess(image).unsqueeze(0).half()  # FP16, matching the model weights
    image_tensor = image_tensor.to(device)
    with torch.no_grad():
        outputs = model(image_tensor)
        probabilities = F.softmax(outputs, dim=1)
        confidence, predicted = torch.max(probabilities, 1)
    predicted_class = class_names[predicted.item()]
    confidence_score = confidence.item() * 100
    return predicted_class, confidence_score

# Example usage
if __name__ == "__main__":
    image_path = "C:/path/to/your/image.jpg"  # Replace with your image path
    predicted_class, confidence = predict_image(image_path, model, preprocess, device, class_names)
    print(f"Predicted Class: {predicted_class}")
    print(f"Confidence: {confidence:.2f}%")
```

# Quantization & Optimization

**Quantization:** Optional FP16 version created using PyTorch's .half() for faster inference and a reduced memory footprint (~50% size reduction); a conversion sketch appears at the end of this card.
**Optimized:** Suitable for deployment on GPU-enabled devices (e.g., CUDA with 12 GB VRAM).

# Usage

- Input: RGB images (any size, resized to 224x224 during preprocessing)
- Output: Predicted skin lesion class (one of 7) with confidence probability

# Limitations

- Generalization: Trained on the marmal88/skin_cancer dataset, which may not fully represent all real-world skin lesions (e.g., varying lighting, angles, or skin types).
- Dataset Bias: Performance may vary depending on dataset diversity (e.g., age, sex, and localization were not used in training).
- Accuracy: Limited to 5 epochs; further training or larger datasets might improve results.

# Future Improvements

- Data Augmentation: Add rotation, flipping, or color jittering to enhance robustness (see the sketch after this list).
- Larger Dataset: Incorporate additional skin cancer datasets (e.g., the ISIC Archive) for better coverage.
- Model Tuning: Experiment with freezing feature layers or adjusting the learning rate for better convergence.
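As a starting point for the augmentation idea above, here is a sketch of a training-time transform pipeline; the specific parameter values are illustrative, not tuned.

```python
from torchvision import transforms

# Training-time augmentation only; validation/test should keep the plain
# resize + normalize pipeline used elsewhere in this card.
train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),   # lesions have no canonical orientation
    transforms.RandomRotation(degrees=20),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.1),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```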
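Finally, the conversion sketch referenced under "Quantization & Optimization": a minimal example, assuming `model` is the fine-tuned FP32 model, of how the FP16 copy can be produced with `.half()` and saved so that the inference example's `torch.load()` call can restore it.

```python
import torch

# .half() converts parameters and buffers to FP16 in place and returns the model,
# roughly halving the checkpoint's size on disk
model_fp16 = model.half()

# Save the full model object (not just a state_dict) to match the
# torch.load(...) call in the inference example
torch.save(model_fp16, "skin_cancer_model_fp16/mobilenetv2_skin_cancer_fp16.pt")
```

Note that FP16 inference is primarily a GPU optimization; on CPU, some operators lack half-precision kernels, so converting back with `.float()` may be necessary.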