NoMAISI: Nodule-Oriented Medical AI for Synthetic Imaging

Nodule-Oriented Medical AI for Synthetic Imaging and Augmentation in Chest CT

Abstract

Medical imaging datasets are increasingly available, yet abnormal and annotation-intensive cases such as lung nodules remain underrepresented. We introduce NoMAISI (Nodule-Oriented Medical AI for Synthetic Imaging), a generative framework built on foundational backbones with flow-based diffusion and ControlNet conditioning. Using NoMAISI, we curated a large multi-cohort lung nodule dataset and applied context-aware nodule volume augmentation, including relocation, shrinkage to simulate early-stage disease, and expansion to model progression. Each case was rendered into multiple synthetic variants, producing a diverse and anatomically consistent dataset. Fidelity was evaluated with cross-cohort similarity metrics, and downstream integration into lung nodule detection and classification tasks demonstrated improved external test performance, particularly in underrepresented lesion categories. These results show that nodule-oriented synthetic imaging and curated augmentation can complement clinical data, reduce annotation demands, and expand the availability of training resources for healthcare AI.

🧩 Workflow Overview

The overall pipeline for organ, body, and nodule segmentation with alignment is shown below:

Segmentation Pipeline

Workflow for constructing the NoMAISI development dataset. The pipeline includes (1) organ segmentation using AI models, (2) body segmentation with algorithmic methods, (3) nodule segmentation through AI-assisted and ML-based refinement, and (4) segmentation alignment to integrate the organ, body, and nodule segmentations into anatomically consistent volumes.
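Step (4) can be pictured as a priority merge of three masks on a shared voxel grid. The sketch below is only an illustration under stated assumptions: the function name `align_segmentations` and the label values `BODY_LABEL` and `NODULE_LABEL` are hypothetical, and the priority order (nodule over organ over body) is a plausible choice rather than the project's documented rule.

```python
import numpy as np

# Hypothetical label values; the real label map is defined by the project, not here.
BODY_LABEL = 200
NODULE_LABEL = 23


def align_segmentations(organs, body, nodules):
    """Merge organ labels, a binary body mask, and a binary nodule mask into one volume."""
    if not (organs.shape == body.shape == nodules.shape):
        raise ValueError("all masks must share the same voxel grid")
    combined = np.zeros(organs.shape, dtype=np.int16)
    combined[body > 0] = BODY_LABEL                  # lowest priority: body outline
    organ_voxels = organs > 0
    combined[organ_voxels] = organs[organ_voxels]    # organ labels overwrite body
    combined[nodules > 0] = NODULE_LABEL             # nodules overwrite everything else
    return combined


# Tiny 1 x 1 x 4 example: background, body only, organ (label 5), organ with a nodule.
organs = np.array([[[0, 0, 5, 5]]])
body = np.array([[[0, 1, 1, 1]]])
nodules = np.array([[[0, 0, 0, 1]]])
merged = align_segmentations(organs, body, nodules)
```

Writing the highest-priority mask last keeps small nodules from being hidden by the larger organ and body regions that contain them.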

NoMAISI_train_and_infer

Overview of our flow-based latent diffusion model with ControlNet conditioning for AI-based CT generation. The pipeline consists of three stages: (top) Pretrained VAE for image compression, where CT images are encoded into latent features using a frozen VAE; (middle) Model fine-tuning, where a Rectified Flow ODE sampler, conditioned on segmentation masks and voxel spacing through a fine-tuned ControlNet, predicts velocity fields in latent space and is optimized with a region-specific contrastive loss emphasizing ROI sensitivity and background consistency; and (bottom) Inference, where segmentation masks and voxel spacing guide latent sampling along the ODE trajectory to obtain a clean latent representation, which is then decoded by the VAE into full-resolution AI-generated CT images conditioned on body and lesion masks.
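The inference stage amounts to integrating a learned velocity field along an ODE. The toy sketch below shows that idea with plain Euler steps. It is not the NoMAISI sampler: `euler_sample` and `toy_velocity` are hypothetical names, the time convention (noise at t = 0, clean latent at t = 1) is an assumption, and the analytic velocity stands in for the trained network, which would also receive the mask and voxel-spacing conditioning.

```python
import numpy as np


def euler_sample(velocity_fn, z0, num_steps=10):
    """Integrate dz/dt = v(z, t) from t = 0 to t = 1 with fixed Euler steps."""
    z = np.asarray(z0, dtype=np.float64).copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        z = z + dt * velocity_fn(z, t)
    return z


# Toy stand-in for the trained network: the exact velocity that carries any
# point along a straight line so that it reaches `target` at t = 1.
target = np.array([1.0, -2.0, 0.5])


def toy_velocity(z, t):
    return (target - z) / (1.0 - t)


rng = np.random.default_rng(0)
noise = rng.standard_normal(3)                 # starting latent (pure noise)
latent = euler_sample(toy_velocity, noise, num_steps=10)
```

With straight-line trajectories, a handful of Euler steps is enough to land on the target in this toy case, which is the usual motivation for rectified-flow samplers.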

📊 Dataset Composition

The table below summarizes the datasets included in this project, with their split sizes (Patients, CT scans, and Nodules) and the annotation types available.

Dataset	Patients n (%)	CT Scans n (%)	Nodules n (%)	Organ Seg	Nodule Seg	Nodule CCC	Nodule Box
LNDbv4	223 (3.17)	223 (2.52)	1132 (7.84)	✗	✓	✗	✓
NSCLC-R	415 (5.89)	415 (4.69)	415 (2.87)	✗	✓	✗	✓
LIDC-IDRI	870 (12.35)	870 (9.84)	2584 (17.89)	✗	✓	✓	✓
DLCS-24	1605 (22.79)	1605 (18.15)	2478 (17.16)	✗	✓	✗	✓
Intgmultiomics	1936 (27.49)	1936 (21.90)	1936 (13.40)	✗	✓	✗	✗
LUNA-25	1993 (28.30)	3792 (42.89)	5899 (40.84)	✗	✓	✗	✓
TOTAL	7042 (100)	8841 (100)	14444 (100)	—	—	—	—

Notes

Percentages indicate proportion relative to the total for each column.
✓ = annotation available, ✗ = annotation not available.
“Nodule CCC” = nodule center coordinates.
“Nodule Box” = bounding-box annotations.
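The column percentages can be rechecked directly from the counts in the table. The short sketch below does so. The helper name `share` is hypothetical, and the counts are copied from the table above.

```python
# (patients, CT scans, nodules) per dataset, copied from the table above.
counts = {
    "LNDbv4": (223, 223, 1132),
    "NSCLC-R": (415, 415, 415),
    "LIDC-IDRI": (870, 870, 2584),
    "DLCS-24": (1605, 1605, 2478),
    "Intgmultiomics": (1936, 1936, 1936),
    "LUNA-25": (1993, 3792, 5899),
}

# Column totals: sum each of the three columns across datasets.
totals = tuple(sum(column) for column in zip(*counts.values()))


def share(dataset):
    """Return the dataset's percentage of each column total, rounded to 2 decimals."""
    return tuple(round(100 * n / total, 2) for n, total in zip(counts[dataset], totals))


for name in counts:
    print(name, share(name))
```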

📚 Dataset References

LNDbv4: https://zenodo.org/records/8348419
NSCLC-Radiomics: https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/
LIDC-IDRI: https://ieee-dataport.org/documents/lung-image-database-consortium-image-collection-lidc-idri
DLCS24: https://zenodo.org/records/13799069
Intgmultiomics: M. Zhao et al., Nat. Commun. (2025).
LUNA25: https://luna25.grand-challenge.org/

AI-Generated CT Evaluations

📉 Fréchet Inception Distance (FID) Results

Fréchet Inception Distance (FID) of the MAISI-V2 baseline and NoMAISI models, with multiple public clinical datasets (test datasets) as references (lower is better).

FID (Avg.)	LNDbv4	NSCLC-R	LIDC-IDRI	DLCS-24	Intgmultiomics	LUNA-25
Real LNDbv4	—	5.13	1.49	1.05	2.40	1.98
Real NSCLC-R	5.13	—	3.12	3.66	1.56	2.65
Real LIDC-IDRI	1.49	3.12	—	0.79	1.44	0.75
Real DLCS-24	1.05	3.66	0.79	—	1.56	1.00
Real Intgmultiomics	2.40	1.56	1.44	1.56	—	1.57
Real LUNA-25	1.98	2.65	0.75	1.00	1.57	—
AI-Generated MAISI-V2	3.15	5.21	2.70	2.32	2.82	1.69
AI-Generated NoMAISI (ours)	2.99	3.05	2.31	2.27	2.62	1.18
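For readers unfamiliar with the metric, FID is the Fréchet distance between two Gaussians fitted to feature vectors of the two image sets. The sketch below computes that distance with NumPy. It makes no claim about the feature extractor or averaging scheme behind the table: `frechet_distance` is a hypothetical helper, and the random arrays are placeholders for real network features.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (n_samples, n_features) arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # trace(sqrt(cov_a @ cov_b)) equals the sum of square roots of the product's
    # eigenvalues, which are real and non-negative for covariance matrices.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Placeholder features: in practice these come from a pretrained network.
rng = np.random.default_rng(0)
real_feats = rng.standard_normal((200, 4))
shifted_feats = real_feats + 3.0               # same spread, mean moved by 3 per axis

same = frechet_distance(real_feats, real_feats)
shifted = frechet_distance(real_feats, shifted_feats)
```

Identical feature sets give a distance of zero, and a pure mean shift contributes the squared distance between the means, which is why lower values indicate closer distributions.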

📉 FID Parity Plot

Parity comparison of FID for real↔real vs AI-generated CT across datasets

Comparison of Fréchet Inception Distance (FID) between real↔real and AI-generated CT datasets. Each point represents a clinical dataset (LNDbv4, NSCLC-R, LIDC-IDRI, DLCS-24, Intgmultiomics, LUNA-25) under one of the generative models (MAISI-V2, NoMAISI). The x-axis shows the median FID computed between real datasets, while the y-axis shows the FID of AI-generated data compared to real data.
The dashed diagonal line denotes parity (y = x), where AI-generated fidelity would match real↔real fidelity.
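The plotted coordinates can be approximated from the FID table. The sketch below assumes that the x-value for a dataset is the median of that dataset's real↔real column (excluding the diagonal) and that the y-value is the corresponding AI-generated entry. The helper name `parity_points` is hypothetical, and the exact aggregation used for the figure may differ.

```python
from statistics import median

datasets = ["LNDbv4", "NSCLC-R", "LIDC-IDRI", "DLCS-24", "Intgmultiomics", "LUNA-25"]

# Real-vs-real FID matrix from the table above (None marks the diagonal).
real_vs_real = [
    [None, 5.13, 1.49, 1.05, 2.40, 1.98],
    [5.13, None, 3.12, 3.66, 1.56, 2.65],
    [1.49, 3.12, None, 0.79, 1.44, 0.75],
    [1.05, 3.66, 0.79, None, 1.56, 1.00],
    [2.40, 1.56, 1.44, 1.56, None, 1.57],
    [1.98, 2.65, 0.75, 1.00, 1.57, None],
]

# AI-generated rows from the same table.
generated = {
    "MAISI-V2": [3.15, 5.21, 2.70, 2.32, 2.82, 1.69],
    "NoMAISI": [2.99, 3.05, 2.31, 2.27, 2.62, 1.18],
}


def parity_points(model):
    """Map each dataset to (median real-vs-real FID, AI-generated FID) for one model."""
    points = {}
    for j, name in enumerate(datasets):
        column = [row[j] for row in real_vs_real if row[j] is not None]
        points[name] = (median(column), generated[model][j])
    return points


for model in generated:
    for name, (x, y) in parity_points(model).items():
        side = "at or below" if y <= x else "above"
        print(f"{model:9s} {name:15s} x={x:.2f} y={y:.2f} ({side} parity)")
```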

🖼️ Example Results

Comparison of CT generation from anatomical masks.

Left: Input organ/body segmentation mask.
Middle: Generated CT slice using MAISI-V2.
Right: Generated CT slice using NoMAISI (ours).
Yellow boxes highlight lung nodule regions for comparison.

Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks

Inference Guide

Project Structure
Configuration Files

Model Weights

Model weights are available upon request. Please email the authors: [email protected].

📁 Project Structure

NoMAISI/
├── configs/                                  # Configuration files
│   ├── config_maisi3d-rflow.json             # Main model configuration
│   ├── infr_env_NoMAISI_DLCSD24_demo.json    # Environment settings
│   └── infr_config_NoMAISI_controlnet.json   # ControlNet inference config
├── scripts/                                  # Python inference scripts
│   ├── infer_testV2_controlnet.py            # Main inference script
│   ├── infer_controlnet.py                   # ControlNet inference
│   └── utils.py                              # Utility functions
├── models/                                   # Pre-trained model weights
├── data/                                     # Input data directory
├── outputs/                                  # Generated results
├── logs/                                     # Execution logs
└── inference.sub                             # SLURM job script

⚙️ Configuration Files

1. Main Model Configuration (`config_maisi3d-rflow.json`): controls the core diffusion model parameters.

Model architecture settings; sampling parameters; image dimensions and spacing

2. Environment Configuration (`infr_env_NoMAISI_DLCSD24_demo.json`): defines the runtime environment.

Data paths and directories; GPU settings; memory allocation

3. ControlNet Configuration (`infr_config_NoMAISI_controlnet.json`): holds ControlNet-specific settings.

Conditioning parameters; generation controls; output specifications
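Before launching a long job, it can help to confirm that all three files exist and parse as JSON. The helper below is a hypothetical convenience, not part of the repository. It makes no assumption about the keys inside each file and only lists the top-level keys it finds.

```python
import json
from pathlib import Path


def load_configs(model_cfg, env_cfg, controlnet_cfg):
    """Load the three JSON files and return their contents keyed by role."""
    paths = {"model": model_cfg, "environment": env_cfg, "controlnet": controlnet_cfg}
    loaded = {}
    for role, path in paths.items():
        path = Path(path)
        if not path.is_file():
            raise FileNotFoundError(f"{role} config not found: {path}")
        with path.open() as handle:
            loaded[role] = json.load(handle)
    return loaded


def summarize(loaded):
    """Return the sorted top-level keys of each loaded config for a quick sanity check."""
    return {role: sorted(cfg) for role, cfg in loaded.items()}
```

A typical call would pass the three paths under `./configs/` and print `summarize(...)` to see which settings each file actually defines.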

🚀 Running Inference

cd /path/NoMAISI/
# Create logs directory if it doesn't exist
mkdir -p logs
# Submit job to SLURM
sbatch inference.sub

# Run inference directly
cd /path/NoMAISI/
python -m scripts.infer_testV2_controlnet \
    -c ./configs/config_maisi3d-rflow.json \
    -e ./configs/infr_env_NoMAISI_DLCSD24_demo.json \
    -t ./configs/infr_config_NoMAISI_controlnet.json
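The direct invocation can also be assembled from Python, for example when looping over several environment files. This is a sketch only: `build_inference_command` is a hypothetical helper, and it simply reproduces the module name and the `-c`, `-e`, `-t` flags shown in the command above.

```python
import sys


def build_inference_command(config_dir="./configs", python=sys.executable):
    """Return the argument list for the direct inference command shown above."""
    return [
        python, "-m", "scripts.infer_testV2_controlnet",
        "-c", f"{config_dir}/config_maisi3d-rflow.json",
        "-e", f"{config_dir}/infr_env_NoMAISI_DLCSD24_demo.json",
        "-t", f"{config_dir}/infr_config_NoMAISI_controlnet.json",
    ]


command = build_inference_command()
print(" ".join(command))
# To execute it, run from the repository root, for example:
#   import subprocess
#   subprocess.run(command, check=True, cwd="/path/NoMAISI/")
```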

Downstream Tasks

Cancer vs. No-Cancer Classification
Nodule Detection
Nodule Segmentation

🔬 Downstream Task: Cancer vs. No-Cancer Classification

Shown: AUC versus the percentage of clinical data retained (x-axis: 100%, 50%, 20%, 10%). All curves use additive augmentation: AI-generated nodules are added to the clinical samples and never replace them.

Clinical (LUNA25) — baseline using only the retained clinical data.
Clinical + AI-gen. (n%) — at each point, add AI-generated data equal to the same percentage as the retained clinical fraction.
Examples: at 50% clinical → +50% AI-gen; 20% → +20%; 10% → +10%.
Clinical + AI-gen. (100%) — at each point, add AI-generated data equal to 100% of the full clinical dataset size, regardless of the retained fraction.
Example: at 10% clinical → +100% AI-gen.
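The three curves differ only in how many AI-generated samples are added at each retained fraction. The sketch below spells out that arithmetic. The function name `training_mix`, the strategy labels, and the example dataset size of 1000 are hypothetical and chosen for illustration.

```python
def training_mix(n_clinical_full, retained_pct, strategy):
    """Return (clinical, AI-generated) sample counts for one point on a curve.

    strategy: "clinical" (no synthetic data), "matched" (n% of the full clinical
    size, equal to the retained percentage), or "full" (100% of the full size).
    """
    n_clinical = n_clinical_full * retained_pct // 100
    if strategy == "clinical":
        n_synthetic = 0
    elif strategy == "matched":
        n_synthetic = n_clinical_full * retained_pct // 100
    elif strategy == "full":
        n_synthetic = n_clinical_full
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return n_clinical, n_synthetic


# Example with a hypothetical full clinical set of 1000 samples.
for pct in (100, 50, 20, 10):
    row = {s: training_mix(1000, pct, s) for s in ("clinical", "matched", "full")}
    print(pct, row)
```

At 10% retention this gives 100 clinical samples alone, 100 clinical plus 100 AI-generated for the matched curve, and 100 clinical plus 1000 AI-generated for the 100% curve.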

Takeaways

AI-generated nodules improve data efficiency: at low clinical fractions (50%→10%), Clinical + AI-gen. (n%) typically matches or exceeds clinical-only AUC.
Larger synthetic boosts (100%) can help in some regimes but may underperform the matched n% mix depending on the cohort, so ratio-balanced augmentation is often the safer choice.
Trends generalize to external cohorts, indicating usability beyond the development data.

Acknowledgements

We gratefully acknowledge the open-source projects that directly informed this repository: the MAISI tutorial from the Project MONAI tutorials, the broader Project MONAI ecosystem, our related benchmark repo AI in Lung Health – Benchmarking, and our companion toolkits PiNS – Point-driven Nodule Segmentation and CaNA – Context-Aware Nodule Augmentation. We thank these communities and contributors for their exceptional open-source efforts. If you use our models or code, please also consider citing these works (alongside this repository) to acknowledge their contributions.
