Dual-View SLaVA-CXR

Dual-View SLaVA-CXR is a vision-language model for structured radiology report generation from frontal and lateral chest X-rays. Built on the Re³ (Recognize–Reason–Report) paradigm and extending the original SLaVA-CXR model, this project integrates dual-view vision fusion and leverages CLIP, BiomedCLIP, and Phi-2 for enhanced anatomical reasoning.


📁 Directory Structure

├── Data Collection and Preprocessing/
│   ├── Data_collection_Mimic.ipynb
│   ├── Data_preprocess.ipynb
│   ├── Radgraph Based Report Cleaning.ipynb
│   └── train_data_json_gen.ipynb
│
├── Evaluate/
│   ├── Evaluate.ipynb
│   └── Results_IU_Xray/           # Contains evaluation results on IU X-ray dataset
│
├── llava_phi/
│   ├── Dual Slava train.ipynb     # Training pipeline
│   └── generation.ipynb           # Inference/report generation
│
├── requirements.txt
└── README.md

🧠 Key Contributions

  • Dual-Encoder Fusion: Combines CLIP and BiomedCLIP features for each view using a learnable weight α.

  • Cross-View Attention: Enables anatomical reasoning across the frontal and lateral views.

  • Gated Feature Fusion: Combines the features of the two views through a gate.

  • Re³ Pipeline:

    1. Recognize: Generate Findings from images
    2. Reason: Infer Impression from Findings
    3. Report: Output structured radiology reports
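The three fusion steps listed above can be illustrated with a small NumPy sketch. This is not the project's implementation: the function names, the sigmoid parameterization of α, the single-head attention, and the gate layout are all assumptions made for illustration, and random arrays stand in for encoder outputs that have already been projected to a common dimension. In the real model these weights would be learned during training.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_encoder_fusion(clip_feats, biomed_feats, alpha_logit):
    """Blend two encoders' token features for one view.

    Assumption: alpha = sigmoid(alpha_logit) keeps the weight in (0, 1).
    """
    alpha = sigmoid(alpha_logit)
    return alpha * clip_feats + (1.0 - alpha) * biomed_feats

def cross_view_attention(query_view, context_view):
    """Scaled dot-product attention: tokens of one view attend to the other."""
    dim = query_view.shape[-1]
    scores = query_view @ context_view.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ context_view

def gated_fusion(frontal, lateral_ctx, w_gate, b_gate):
    """Mix frontal tokens with lateral context through a per-feature gate."""
    gate_in = np.concatenate([frontal, lateral_ctx], axis=-1)
    gate = sigmoid(gate_in @ w_gate + b_gate)
    return gate * frontal + (1.0 - gate) * lateral_ctx

# Hypothetical sizes and random stand-ins for projected encoder outputs.
rng = np.random.default_rng(0)
n_tokens, dim = 4, 8
frontal = dual_encoder_fusion(
    rng.normal(size=(n_tokens, dim)), rng.normal(size=(n_tokens, dim)), alpha_logit=0.0
)
lateral = dual_encoder_fusion(
    rng.normal(size=(n_tokens, dim)), rng.normal(size=(n_tokens, dim)), alpha_logit=0.0
)
lateral_ctx = cross_view_attention(frontal, lateral)
fused = gated_fusion(
    frontal, lateral_ctx, w_gate=rng.normal(size=(2 * dim, dim)) * 0.1, b_gate=np.zeros(dim)
)
print(fused.shape)  # (4, 8)
```

The fused tokens keep the shape of the frontal tokens, so they could be passed to a language model in place of single-view image features.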

📊 Evaluation Metrics

Dataset     BLEU   ROUGE-L   METEOR   BERTScore   RadGraph F1   CheXbert F1
MIMIC-CXR   ✅     ✅        ✅       ✅          ✅            ✅
IU X-Ray    ✅     ✅        ✅       ✅          ✅            ✅

(Results in /Evaluate/Results_IU_Xray)
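Of the metrics listed above, ROUGE-L is simple enough to sketch in a few lines. The version below is a simplified illustration that assumes lowercase whitespace tokenization and the balanced F-measure; it is not the scorer used in Evaluate.ipynb, and library implementations may differ in tokenization and stemming.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 between a reference report and a generated report."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

score = rouge_l_f1("No acute cardiopulmonary abnormality", "No acute abnormality")
print(round(score, 3))  # 0.857
```

Here the longest common subsequence has 3 tokens, giving precision 3/3 and recall 3/4.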


🛠️ Setup

# Clone repo
git clone https://github.com/Clintonkjkj/Dual-View-Slava-CXR.git
cd Dual-View-Slava-CXR

# Set up virtual environment
python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

🚀 Usage

Download the model

Hugging Face: https://huggingface.co/CKJ26/Dual-View-Slava-Final
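One way to fetch the weights programmatically is the `huggingface_hub` package. The sketch below assumes that package is installed (it may not be listed in requirements.txt); the `download_model` helper and the local directory name are hypothetical.

```python
REPO_ID = "CKJ26/Dual-View-Slava-Final"

def download_model(local_dir="checkpoints/dual-view-slava"):
    """Download every file in the model repo and return the local path.

    The default local_dir is a hypothetical location; adjust as needed.
    """
    # Imported here so the package is only required when a download is requested.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)

# Usage (requires network access):
# model_path = download_model()
```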

🏋️ Train the Model

Use llava_phi/Dual Slava train.ipynb after preparing the data with the notebooks in Data Collection and Preprocessing/:

  • Data_collection_Mimic.ipynb
  • Data_preprocess.ipynb
  • Radgraph Based Report Cleaning.ipynb
  • train_data_json_gen.ipynb

📄 Generate Reports

Use llava_phi/generation.ipynb with both frontal and lateral views, plus a prompt (e.g., "Generate a radiology report").
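The Recognize, Reason, Report flow can be outlined as a two-stage prompting loop. The sketch below is an assumption about how the stages might be chained, not the code in generation.ipynb: `generate` is a hypothetical callable that wraps the model, the prompts are placeholders, and the Reason step is assumed to be text-only. A stub stands in for the model so the example runs without weights.

```python
from typing import Callable, Dict, Sequence

# Hypothetical signature: (image paths, prompt) -> generated text.
Generate = Callable[[Sequence[str], str], str]

def re3_report(generate: Generate, frontal_path: str, lateral_path: str) -> Dict[str, str]:
    """Run Recognize -> Reason -> Report and return each stage's output."""
    images = [frontal_path, lateral_path]
    # Recognize: describe findings from both views.
    findings = generate(images, "Describe the findings in these chest X-rays.")
    # Reason: infer the impression from the findings text (assumed text-only).
    impression = generate([], f"Findings: {findings}\nSummarize the impression.")
    # Report: assemble the structured report.
    report = f"FINDINGS: {findings}\nIMPRESSION: {impression}"
    return {"findings": findings, "impression": impression, "report": report}

def fake_generate(images: Sequence[str], prompt: str) -> str:
    """Stub standing in for the real model."""
    return "Lungs are clear." if images else "No acute disease."

result = re3_report(fake_generate, "frontal.png", "lateral.png")
print(result["report"])
```

Replacing `fake_generate` with a function that calls the loaded model would produce real reports in the same structure.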


🖼️ Model Architecture

[Figure: model architecture diagram]


📚 Citation

@misc{dualviewslava2025,
  title={Dual View SLaVA-CXR: Structured Radiology Reporting via Multi-View Chest X-rays},
  author={Clinton KJ et al.},
  year={2025},
  note={Capstone Project}
}

🧑‍💻 Author


📜 License

This repository is provided for academic research purposes only.

Model details: 2.97B parameters, Safetensors format (F32 and BF16 tensors); base model: microsoft/phi-2.