arxiv:2508.15930

Semantic-Aware Ship Detection with Vision-Language Integration

Published on Aug 21

Authors:

Jiancheng Pan ,

Abstract

A novel ship detection framework combining Vision-Language Models and a multi-scale adaptive sliding window strategy improves fine-grained semantic information capture using a specialized ShipSem-VL dataset.

AI-generated summary

Ship detection in remote sensing imagery is a critical task with wide-ranging applications, such as maritime activity monitoring, shipping logistics, and environmental studies. However, existing methods often struggle to capture fine-grained semantic information, limiting their effectiveness in complex scenarios. To address these challenges, we propose a novel detection framework that combines Vision-Language Models (VLMs) with a multi-scale adaptive sliding window strategy. To facilitate Semantic-Aware Ship Detection (SASD), we introduce ShipSem-VL, a specialized Vision-Language dataset designed to capture fine-grained ship attributes. We evaluate our framework through three well-defined tasks, providing a comprehensive analysis of its performance and demonstrating its effectiveness in advancing SASD from multiple perspectives.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2508.15930 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2508.15930 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2508.15930 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.