metadata
library_name: transformers
tags:
  - vision
  - semantic-segmentation
  - segformer
  - sky
  - sea
  - obstacle
  - huggingface
license: apache-2.0
datasets:
  - Wilbur1240/MaSTr1325_512x384
base_model:
  - nvidia/segformer-b0-finetuned-ade-512-512

SegFormer Fine-Tuned on Custom Sky/Sea/Obstacle Dataset

This model is a fine-tuned version of nvidia/segformer-b0-finetuned-ade-512-512, trained on a custom maritime dataset (Wilbur1240/MaSTr1325_512x384) with three semantic classes:

  • Sky
  • Sea
  • Obstacle

It is intended for use in vision-based autonomous surface navigation and maritime scene understanding.


Model Details

Model Description

  • Base architecture: SegFormer-B0
  • Pretrained on: ADE20K dataset
  • Fine-tuned for: Semantic segmentation on maritime images
  • Number of classes: 3
  • Ignore index: 255
  • Resolution: 512×512 input images
  • Training precision: fp32
  • Framework: PyTorch with 🤗 Transformers
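The card does not say how the model was evaluated, so the following is only a minimal sketch of how an ignore index of 255 is commonly handled when scoring a 3-class segmentation. The function name `per_class_iou` and the toy arrays are hypothetical; only the class count and the ignore value come from the list above.

```python
import numpy as np

NUM_CLASSES = 3      # sky, sea, obstacle (from the model card)
IGNORE_INDEX = 255   # label value excluded from scoring (from the model card)


def per_class_iou(pred, target, num_classes=NUM_CLASSES, ignore_index=IGNORE_INDEX):
    """Return one IoU per class, skipping pixels labelled ignore_index.

    A class that appears in neither prediction nor target yields NaN.
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    valid = target != ignore_index  # True where the ground truth should count

    ious = []
    for c in range(num_classes):
        pred_c = (pred == c) & valid
        target_c = (target == c) & valid
        inter = np.logical_and(pred_c, target_c).sum()
        union = np.logical_or(pred_c, target_c).sum()
        ious.append(inter / union if union > 0 else float("nan"))
    return ious


# Toy 2x3 label maps; the top-right ground-truth pixel is ignored.
target = np.array([[0, 0, 255],
                   [1, 2, 2]])
pred = np.array([[0, 1, 1],
                 [1, 2, 0]])
ious = per_class_iou(pred, target)
print(ious)
```

The same mask (`target != 255`) is what keeps unlabeled or ambiguous pixels from counting for or against the model, whatever metric is used.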

Model Sources


Usage

from transformers import AutoModelForSemanticSegmentation, AutoImageProcessor
from PIL import Image
import torch

# Load model and processor
model = AutoModelForSemanticSegmentation.from_pretrained("Wilbur1240/segformer-b0-finetuned-ade-512-512-finetune-mastr1325")
processor = AutoImageProcessor.from_pretrained("Wilbur1240/segformer-b0-finetuned-ade-512-512-finetune-mastr1325")

# Load and preprocess an image
image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

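# Inference
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits  # [1, num_classes, H/4, W/4]: SegFormer predicts at 1/4 of the processed input size

# Upsample the logits to the original image size before taking the argmax
upsampled_logits = torch.nn.functional.interpolate(
    logits, size=image.size[::-1], mode="bilinear", align_corners=False  # PIL size is (W, H)
)
pred_seg = upsampled_logits.argmax(dim=1)  # [1, H, W] class IDs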
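A predicted class map is easier to inspect once it is turned into a color image or summarized as per-class coverage. The sketch below shows one way to do that with NumPy. The palette colors and the class-ID order (0 = sky, 1 = sea, 2 = obstacle) are assumptions, not taken from the model; check `model.config.id2label` for the actual mapping. The helper names `colorize` and `class_fractions` are hypothetical.

```python
import numpy as np

# Hypothetical palette. The ID order is an ASSUMPTION: verify it against
# model.config.id2label before relying on the colors.
PALETTE = np.array([
    [135, 206, 235],  # assumed id 0: sky
    [0, 105, 148],    # assumed id 1: sea
    [220, 20, 60],    # assumed id 2: obstacle
], dtype=np.uint8)


def colorize(class_map, palette=PALETTE):
    """Map an (H, W) array of class IDs to an (H, W, 3) uint8 RGB image."""
    return palette[np.asarray(class_map)]


def class_fractions(class_map, num_classes=3):
    """Return the fraction of pixels assigned to each class ID."""
    counts = np.bincount(np.asarray(class_map).ravel(), minlength=num_classes)
    return counts / counts.sum()


# Toy 2x4 class map standing in for a real prediction.
demo = np.array([[0, 0, 0, 0],
                 [1, 1, 2, 1]])
rgb = colorize(demo)
fractions = class_fractions(demo)
print(rgb.shape, fractions)
```

With a real prediction, something like `colorize(pred_seg[0].cpu().numpy())` followed by `Image.fromarray(...)` would give an image that can be saved or overlaid on the input.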