---
license: mit
tags:
- vision-transformer
- spectrogram-analysis
- lora
- pytorch
- regression
---
# Vision Transformer (ViT) with LoRA for Spectrogram Regression
<div style="display: flex; flex-wrap: wrap; gap: 15px; margin-top: 15px;">
<div style="flex: 1; min-width: 200px; background: white; border-radius: 8px; padding: 15px; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h4 style="margin-top: 0; color: #5f6368;">Curated by</h4>
<p>Nooshin Bahador</p>
</div>
<div style="flex: 1; min-width: 200px; background: white; border-radius: 8px; padding: 15px; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h4 style="margin-top: 0; color: #5f6368;">Funded by</h4>
<p>Canadian Neuroanalytics Scholars Program</p>
</div>
<div style="flex: 1; min-width: 200px; background: white; border-radius: 8px; padding: 15px; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h4 style="margin-top: 0; color: #5f6368;">License</h4>
<p>MIT</p>
</div>
</div>
## Model Description
This is a Vision Transformer (ViT) fine-tuned with Low-Rank Adaptation (LoRA) for regression on spectrogram images. Given a spectrogram, the model predicts three parameters of the chirp signal it contains:
1. Chirp start time (s)
2. Start frequency (Hz)
3. End frequency (Hz)
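
As a minimal sketch of how the three regression outputs might be handled downstream, the snippet below wraps a raw 3-element prediction in a small named structure. The class name `ChirpPrediction`, the assumption that the output order matches the list above, and the example values are all illustrative and are not taken from the released code.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ChirpPrediction:
    """Hypothetical container for the three predicted chirp parameters."""

    start_time_s: float   # chirp start time, in seconds
    start_freq_hz: float  # frequency at chirp onset, in Hz
    end_freq_hz: float    # frequency at chirp end, in Hz

    @classmethod
    def from_vector(cls, values: Sequence[float]) -> "ChirpPrediction":
        # Assumes the model emits [start time, start frequency, end frequency].
        if len(values) != 3:
            raise ValueError(f"expected 3 values, got {len(values)}")
        start_time, start_freq, end_freq = (float(v) for v in values)
        return cls(start_time, start_freq, end_freq)

    @property
    def sweep_hz(self) -> float:
        # Signed frequency change over the chirp.
        return self.end_freq_hz - self.start_freq_hz

    @property
    def direction(self) -> str:
        # "up" for rising chirps, "down" for falling ones.
        if self.sweep_hz > 0:
            return "up"
        if self.sweep_hz < 0:
            return "down"
        return "flat"


# Made-up values for illustration only, not real model output.
pred = ChirpPrediction.from_vector([0.5, 10.0, 40.0])
print(pred.direction, pred.sweep_hz)
```

Keeping the units in the field names makes it harder to mix up seconds and hertz when the raw output is passed around as a plain vector.
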
<div style="background: #f8f9fa; border-radius: 8px; padding: 20px; margin-bottom: 20px; border-left: 4px solid #4285f4;">
<h2 style="margin-top: 0;">Fine-Tuning Details</h2>
<div style="background: white; border-radius: 8px; padding: 15px; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<ul>
<li><strong>Framework:</strong> PyTorch</li>
<li><strong>Architecture:</strong> Pre-trained Vision Transformer (ViT)</li>
<li><strong>Adaptation Method:</strong> LoRA (Low-Rank Adaptation)</li>
<li><strong>Task:</strong> Regression on time-frequency representations</li>
<li><strong>Training Protocol:</strong> Automatic mixed precision (AMP), early stopping, and learning-rate scheduling</li>
<li><strong>Output:</strong> Quantitative predictions + optional natural language descriptions</li>
</ul>
</div>
</div>
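
To illustrate the adaptation method named above, here is a minimal PyTorch sketch of a LoRA-wrapped linear layer: the pre-trained weight is frozen and only a low-rank update is trained. The class name `LoRALinear`, the 768-dimensional layer size, and the `rank` and `alpha` values are assumptions chosen for illustration; the linked repository may configure LoRA differently.

```python
import math

import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (illustrative sketch)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weights stay fixed
        # Low-rank factors: A maps in_features -> rank, B maps rank -> out_features.
        self.lora_a = nn.Parameter(torch.empty(rank, base.in_features))
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # B starts at zero, so the layer initially reproduces the base output.
        update = x @ self.lora_a.T @ self.lora_b.T
        return self.base(x) + update * self.scaling


# 768 is an assumed hidden size, used here only as an example.
layer = LoRALinear(nn.Linear(768, 768), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in layer.parameters() if not p.requires_grad)
print(f"trainable={trainable}, frozen={frozen}")
```

Because only the two small factor matrices receive gradients, the number of trainable parameters per adapted layer is a small fraction of the frozen weight count, which is the main appeal of LoRA for fine-tuning a large pre-trained ViT.
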
<div style="background: #f8f9fa; border-radius: 8px; padding: 20px; margin-bottom: 20px; border-left: 4px solid #34a853;">
<h2 style="margin-top: 0;">Resources</h2>
<div style="display: flex; flex-wrap: wrap; gap: 15px;">
<div style="flex: 1; min-width: 250px; background: white; border-radius: 8px; padding: 15px; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h4 style="margin-top: 0;">Trained Model</h4>
<p><a href="https://huggingface.co/nubahador/Fine_Tuned_Transformer_Model_for_Chirp_Localization/tree/main">HuggingFace Model Hub</a></p>
</div>
<div style="flex: 1; min-width: 250px; background: white; border-radius: 8px; padding: 15px; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h4 style="margin-top: 0;">Spectrogram Dataset</h4>
<p><a href="https://huggingface.co/datasets/nubahador/ChirpLoc100K___A_Synthetic_Spectrogram_Dataset_for_Chirp_Localization/tree/main">HuggingFace Dataset Hub</a></p>
</div>
<div style="flex: 1; min-width: 250px; background: white; border-radius: 8px; padding: 15px; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h4 style="margin-top: 0;">PyTorch Implementation</h4>
<p><a href="https://github.com/nbahador/Train_Spectrogram_Transformer">GitHub Repository</a></p>
</div>
<div style="flex: 1; min-width: 250px; background: white; border-radius: 8px; padding: 15px; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<h4 style="margin-top: 0;">Chirp Generator</h4>
<p><a href="https://github.com/nbahador/chirp_spectrogram_generator">GitHub Package</a></p>
</div>
</div>
</div>
<div style="background: #f8f9fa; border-radius: 8px; padding: 20px; border-left: 4px solid #ea4335;">
<h2 style="margin-top: 0;">Citation</h2>
<div style="background: white; border-radius: 8px; padding: 15px; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
<p>If you use this model in your research, please cite:</p>
<p>Bahador, N., & Lankarany, M. (2025). Chirp localization via fine-tuned transformer model: A proof-of-concept study. arXiv preprint arXiv:2503.22713. <a href="https://arxiv.org/pdf/2503.22713">[PDF]</a></p>
</div>
</div>