Dual-View SLaVA-CXR
Dual-View SLaVA-CXR is a vision-language model for structured radiology report generation from frontal and lateral chest X-rays. Built on the Re³ (Recognize→Reason→Report) paradigm and extending the original SLaVA-CXR model, this project integrates dual-view vision fusion and leverages CLIP, BiomedCLIP, and Phi-2 for enhanced anatomical reasoning.
Directory Structure
├── Data Collection and Preprocessing/
│   ├── Data_collection_Mimic.ipynb
│   ├── Data_preprocess.ipynb
│   ├── Radgraph Based Report Cleaning.ipynb
│   └── train_data_json_gen.ipynb
│
├── Evaluate/
│   ├── Evaluate.ipynb
│   └── Results_IU_Xray/        # Evaluation results on the IU X-Ray dataset
│
├── llava_phi/
│   ├── Dual Slava train.ipynb  # Training pipeline
│   └── generation.ipynb        # Inference/report generation
│
├── requirements.txt
└── README.md
Key Contributions
Dual-Encoder Fusion: Combines CLIP and BiomedCLIP features for each view using a learnable weight α.
Cross-View Attention: Enables anatomical reasoning across the frontal and lateral views.
Gated Feature Fusion: Merges the features of the two views through a learned gate.
Re³ Pipeline:
- Recognize: Generate Findings from images
- Reason: Infer Impression from Findings
- Report: Output structured radiology reports
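
The three fusion components listed above can be written down in one common form. This is a sketch under stated assumptions, not the exact formulation used in this repository: the projections $P_{\text{CLIP}}$ and $P_{\text{Bio}}$ (which map both encoders to a shared width $d$), the attention weights $W_Q, W_K, W_V$, and the gate parameters $W_g, b_g$ are all hypothetical names introduced here. Only the learnable weight $\alpha$, the two encoders, and the two views (frontal $\text{f}$, lateral $\text{l}$) come from the description above.

$$
\begin{aligned}
F_v &= \alpha \, P_{\text{CLIP}}\big(E_{\text{CLIP}}(x_v)\big) + (1-\alpha)\, P_{\text{Bio}}\big(E_{\text{Bio}}(x_v)\big), \qquad v \in \{\text{f}, \text{l}\} \\
\tilde{F}_{\text{f}} &= \operatorname{softmax}\!\left(\frac{(F_{\text{f}} W_Q)(F_{\text{l}} W_K)^{\top}}{\sqrt{d}}\right) F_{\text{l}} W_V \\
\tilde{F}_{\text{l}} &= \operatorname{softmax}\!\left(\frac{(F_{\text{l}} W_Q)(F_{\text{f}} W_K)^{\top}}{\sqrt{d}}\right) F_{\text{f}} W_V \\
g &= \sigma\big([\tilde{F}_{\text{f}} \,;\, \tilde{F}_{\text{l}}]\, W_g + b_g\big), \qquad W_g \in \mathbb{R}^{2d \times d} \\
F &= g \odot \tilde{F}_{\text{f}} + (1-g) \odot \tilde{F}_{\text{l}}
\end{aligned}
$$

Here $x_v$ is the image for view $v$, $[\cdot\,;\,\cdot]$ concatenates along the feature dimension (which assumes both views yield the same number of tokens), $\sigma$ is the logistic sigmoid, and $\odot$ is element-wise multiplication. With $\alpha \in [0, 1]$, the first line is a convex combination of the two encoders' features.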
Evaluation Metrics
| Dataset | BLEU | ROUGE-L | METEOR | BERTScore | RadGraph F1 | CheXbert F1 |
|---|---|---|---|---|---|---|
| MIMIC-CXR | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| IU X-Ray | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
(Results in /Evaluate/Results_IU_Xray)
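
As an illustration of what one of the listed metrics measures, here is a minimal pure-Python sketch of ROUGE-L as an F1 score over the longest common subsequence (LCS) of tokens. It assumes lowercased whitespace tokenization and a single reference report. The function names are hypothetical, and Evaluate.ipynb may rely on a library with different tokenization or weighting, so scores need not match.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for token_a in a:
        curr = [0]
        for j, token_b in enumerate(b, start=1):
            if token_a == token_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """LCS-based F1 between a generated report and a reference report."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of generated tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    generated = "the lungs are clear"
    reference = "lungs are clear bilaterally"
    print(f"ROUGE-L F1: {rouge_l_f1(generated, reference):.2f}")
```

Because the LCS respects word order, a report that contains the right terms in a scrambled order scores lower than one that preserves the reference phrasing.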
Setup
# Clone repo
git clone https://github.com/Clintonkjkj/Dual-View-Slava-CXR.git
cd Dual-View-Slava-CXR
# Set up virtual environment
python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Usage
Download the model
Hugging Face: https://huggingface.co/CKJ26/Dual-View-Slava-Final
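
The checkpoint linked above can also be fetched from a script. The sketch below assumes the `huggingface_hub` package is installed (it may or may not be listed in requirements.txt). The target directory and the helper name `download_checkpoint` are hypothetical choices for illustration.

```python
from pathlib import Path

# Repository ID taken from the model URL above.
REPO_ID = "CKJ26/Dual-View-Slava-Final"


def download_checkpoint(local_dir: str = "checkpoints/dual-view-slava") -> str:
    """Download all files of the model repository and return the local path.

    The import is done lazily so this module can be loaded even when
    huggingface_hub is not installed (an assumption of this sketch).
    """
    from huggingface_hub import snapshot_download

    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    return snapshot_download(repo_id=REPO_ID, local_dir=str(target))


if __name__ == "__main__":
    print("Model files stored in:", download_checkpoint())
```

The returned path can then be used wherever llava_phi/generation.ipynb expects a local model directory.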
Train the Model
Use llava_phi/Dual Slava train.ipynb after preparing the data with these notebooks from Data Collection and Preprocessing/:
Data_collection_Mimic.ipynb
Data_preprocess.ipynb
Radgraph Based Report Cleaning.ipynb
train_data_json_gen.ipynb
Generate Reports
Use llava_phi/generation.ipynb with both the frontal and lateral views, plus a prompt (e.g., "Generate a radiology report").
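
The Recognize, Reason, Report flow can be sketched as two generation calls followed by assembly of the structured report. This is an illustrative sketch, not the code in generation.ipynb: the `generate` callable, its signature, the prompt wording, and the output layout are all hypothetical stand-ins for the model's actual inference function.

```python
from typing import Callable, Dict, Optional

# Hypothetical interface: (prompt, frontal_path, lateral_path) -> generated text.
GenerateFn = Callable[[str, Optional[str], Optional[str]], str]


def re3_report(generate: GenerateFn, frontal_path: str, lateral_path: str) -> Dict[str, str]:
    """Run a Recognize -> Reason -> Report pass with a caller-supplied model."""
    # Recognize: describe the findings from both image views.
    findings = generate(
        "Describe the findings in these chest X-rays.",
        frontal_path,
        lateral_path,
    ).strip()

    # Reason: infer the impression from the findings text alone (no images).
    impression = generate(
        f"Findings: {findings}\nSummarize the impression.",
        None,
        None,
    ).strip()

    # Report: assemble the structured output.
    report = f"FINDINGS: {findings}\nIMPRESSION: {impression}"
    return {"findings": findings, "impression": impression, "report": report}


if __name__ == "__main__":
    # Stand-in model used only to demonstrate the call pattern.
    def fake_generate(prompt: str, frontal: Optional[str], lateral: Optional[str]) -> str:
        return "Lungs are clear." if frontal else "No acute disease."

    print(re3_report(fake_generate, "frontal.png", "lateral.png")["report"])
```

Passing the model in as a callable keeps the three stages independent of how the checkpoint is loaded, so the same flow works with the notebook's inference function or a test stub.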
Model Architecture
[Figure: model architecture diagram]
Citation
@misc{dualviewslava2025,
title={Dual View SLaVA-CXR: Structured Radiology Reporting via Multi-View Chest X-rays},
author={Clinton KJ et al.},
year={2025},
note={Capstone Project}
}
Author
- Clinton KJ (Hugging Face Profile)
License
This repository is provided for academic research purposes only.