NoMAISI: Nodule-Oriented Medical AI for Synthetic Imaging
Abstract
Medical imaging datasets are increasingly available, yet abnormal and annotation-intensive cases such as lung nodules remain underrepresented. We introduce NoMAISI (Nodule-Oriented Medical AI for Synthetic Imaging), a generative framework built on foundational backbones with flow-based diffusion and ControlNet conditioning. Using NoMAISI, we curated a large multi-cohort lung nodule dataset and applied context-aware nodule volume augmentation, including relocation, shrinkage to simulate early-stage disease, and expansion to model progression. Each case was rendered into multiple synthetic variants, producing a diverse and anatomically consistent dataset. Fidelity was evaluated with cross-cohort similarity metrics, and downstream integration into lung nodule detection and classification tasks demonstrated improved external test performance, particularly in underrepresented lesion categories. These results show that nodule-oriented synthetic imaging and curated augmentation can complement clinical data, reduce annotation demands, and expand the availability of training resources for healthcare AI.
Workflow Overview
The overall pipeline for organ, body, and nodule segmentation with alignment is shown below:
Workflow for constructing the NoMAISI development dataset. The pipeline includes (1) organ segmentation using AI models, (2) body segmentation with algorithmic methods, (3) nodule segmentation through AI-assisted and ML-based refinement, and (4) segmentation alignment to integrate the organ, body, and nodule segmentations into anatomically consistent volumes.
Overview of our flow-based latent diffusion model with ControlNet conditioning for AI-based CT generation. The pipeline consists of three stages: (top) Pretrained VAE for image compression, where CT images are encoded into latent features using a frozen VAE; (middle) Model fine-tuning, where a Rectified Flow ODE sampler, conditioned on segmentation masks and voxel spacing through a fine-tuned ControlNet, predicts velocity fields in latent space and is optimized with a region-specific contrastive loss emphasizing ROI sensitivity and background consistency; and (bottom) Inference, where segmentation masks and voxel spacing guide latent sampling along the ODE trajectory to obtain a clean latent representation, which is then decoded by the VAE into full-resolution AI-generated CT images conditioned by body and lesion masks.
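To make the inference stage more concrete, here is a toy sketch of Euler integration along a rectified-flow ODE. It is not the NoMAISI sampler: the time convention (t = 0 is noise, t = 1 is the clean latent), the step count, and the names `euler_sample` and `make_toy_velocity` are assumptions, and the hand-written velocity function merely stands in for the ControlNet-conditioned network.

```python
from typing import Callable, List

Vector = List[float]

def euler_sample(velocity_fn: Callable[[Vector, float], Vector],
                 noise: Vector, num_steps: int = 30) -> Vector:
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed Euler steps."""
    x = list(noise)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

def make_toy_velocity(mask: Vector) -> Callable[[Vector, float], Vector]:
    """Return a velocity field that points straight at a mask-dependent target.

    Stand-in for a trained network: here the 'clean latent' is simply the mask.
    """
    def toy_velocity(x: Vector, t: float) -> Vector:
        # Velocity of a straight path that reaches the target at t = 1.
        return [(m - xi) / (1.0 - t) for m, xi in zip(mask, x)]
    return toy_velocity

mask = [0.0, 1.0, 1.0, 0.0]      # toy conditioning signal (hypothetical)
noise = [0.3, -1.2, 0.8, 2.0]    # toy starting latent (hypothetical)
latent = euler_sample(make_toy_velocity(mask), noise, num_steps=30)
```

In the real pipeline the resulting latent would then be decoded by the VAE; in this toy setting the trajectory is a straight line, so the final latent coincides with the mask up to floating-point error.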
Dataset Composition
The table below summarizes the datasets included in this project, with their split sizes (Patients, CT scans, and Nodules) and the annotation types available.
Dataset | Patients n (%) | CT Scans n (%) | Nodules n (%) | Organ Seg | Nodule Seg | Nodule CCC | Nodule Box |
---|---|---|---|---|---|---|---|
LNDbv4 | 223 (3.17) | 223 (2.52) | 1132 (7.84) | β | β | β | β |
NSCLC-R | 415 (5.89) | 415 (4.69) | 415 (2.87) | β | β | β | β |
LIDC-IDRI | 870 (12.35) | 870 (9.84) | 2584 (17.89) | β | β | β | β |
DLCS-24 | 1605 (22.79) | 1605 (18.15) | 2478 (17.16) | β | β | β | β |
Intgmultiomics | 1936 (27.49) | 1936 (21.90) | 1936 (13.40) | β | β | β | β |
LUNA-25 | 1993 (28.30) | 3792 (42.89) | 5899 (40.84) | β | β | β | β |
TOTAL | 7042 (100) | 8841 (100) | 14444 (100) | β | β | β | β |
Notes
- Percentages indicate proportion relative to the total for each column.
- βοΈ = annotation available, β = annotation not available.
- "Nodule CCC" = nodule center coordinates.
- "Nodule Box" = bounding-box annotations.
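As a quick check of how the percentages are derived, the snippet below recomputes the nodule column from the counts in the table. The variable and function names are illustrative only.

```python
# Nodule counts copied from the table above.
nodule_counts = {
    "LNDbv4": 1132,
    "NSCLC-R": 415,
    "LIDC-IDRI": 2584,
    "DLCS-24": 2478,
    "Intgmultiomics": 1936,
    "LUNA-25": 5899,
}

def column_percentages(counts):
    """Return each entry as a percentage of the column total, rounded to 2 decimals."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}

nodule_percentages = column_percentages(nodule_counts)
for name, pct in nodule_percentages.items():
    print(f"{name}: {nodule_counts[name]} ({pct:.2f})")
```

The same function applies to the patient and CT scan columns.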
Dataset Citations and References
- LNDbv4: https://zenodo.org/records/8348419
- NSCLC-Radiomics: https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/
- LIDC-IDRI: https://ieee-dataport.org/documents/lung-image-database-consortium-image-collection-lidc-idri
- DLCS24: https://zenodo.org/records/13799069
- Intgmultiomics: M. Zhao et al., Nat. Commun. (2025).
- LUNA25: https://luna25.grand-challenge.org/
AI-Generated CT Evaluations
Fréchet Inception Distance (FID) Results
Fréchet Inception Distance (FID) of the MAISI-V2 baseline and NoMAISI models, with multiple public clinical datasets (test sets) as the references (lower is better).
FID (Avg.) | LNDbv4 | NSCLC-R | LIDC-IDRI | DLCS-24 | Intgmultiomics | LUNA-25 |
---|---|---|---|---|---|---|
Real LNDbv4 | - | 5.13 | 1.49 | 1.05 | 2.40 | 1.98 |
Real NSCLC-R | 5.13 | - | 3.12 | 3.66 | 1.56 | 2.65 |
Real LIDC-IDRI | 1.49 | 3.12 | - | 0.79 | 1.44 | 0.75 |
Real DLCS-24 | 1.05 | 3.66 | 0.79 | - | 1.56 | 1.00 |
Real Intgmultiomics | 2.40 | 1.56 | 1.44 | 1.56 | - | 1.57 |
Real LUNA-25 | 1.98 | 2.65 | 0.75 | 1.00 | 1.57 | - |
AI-Generated MAISI-V2 | 3.15 | 5.21 | 2.70 | 2.32 | 2.82 | 1.69 |
AI-Generated NoMAISI (ours) | 2.99 | 3.05 | 2.31 | 2.27 | 2.62 | 1.18 |
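For reference, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to feature embeddings of the reference and generated images. The symbols below follow the standard definition and are not notation taken from this repository; the feature extractor and the averaging behind "FID (Avg.)" are not specified here.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
```

Here \mu_r and \Sigma_r are the mean and covariance of the reference features, and \mu_g and \Sigma_g are those of the generated (or second real) dataset. Identical feature distributions give a value of zero.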
FID Parity Plot
Comparison of Fréchet Inception Distance (FID) between real-real and AI-generated CT datasets. Each point represents a clinical dataset (LNDbv4, NSCLC-R, LIDC-IDRI, DLCS24, Intgmultiomics, LUNA25) under different generative models (MAISI-V2, NoMAISI). The x-axis shows the median FID computed between real datasets, while the y-axis shows the FID of AI-generated data compared to real.
The dashed diagonal line denotes parity (y = x), where AI-generated fidelity would match real-real fidelity.
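The parity comparison can be approximated from the FID table. The sketch below assumes that the x-coordinate of each point is the median of that dataset's real-real FIDs against the other five cohorts; this reading of "median FID" and all names in the code are assumptions, while the numbers are copied from the table.

```python
from statistics import median

datasets = ["LNDbv4", "NSCLC-R", "LIDC-IDRI", "DLCS-24", "Intgmultiomics", "LUNA-25"]

# Real-vs-real FID, one row per reference dataset (None marks the diagonal).
real_real = [
    [None, 5.13, 1.49, 1.05, 2.40, 1.98],
    [5.13, None, 3.12, 3.66, 1.56, 2.65],
    [1.49, 3.12, None, 0.79, 1.44, 0.75],
    [1.05, 3.66, 0.79, None, 1.56, 1.00],
    [2.40, 1.56, 1.44, 1.56, None, 1.57],
    [1.98, 2.65, 0.75, 1.00, 1.57, None],
]

# Generated-vs-real FID per model, in the same dataset order.
generated = {
    "MAISI-V2": [3.15, 5.21, 2.70, 2.32, 2.82, 1.69],
    "NoMAISI": [2.99, 3.05, 2.31, 2.27, 2.62, 1.18],
}

# x-axis: median real-real FID per dataset, ignoring the diagonal.
median_real = [median(v for v in row if v is not None) for row in real_real]

# A point lies on or below the parity line when generated FID <= real-real median.
at_or_below_parity = {
    model: [d for d, x, y in zip(datasets, median_real, fids) if y <= x]
    for model, fids in generated.items()
}
print(dict(zip(datasets, median_real)))
print(at_or_below_parity)
```

Under this assumption, NoMAISI falls on or below the parity line for NSCLC-R and LUNA-25, while every MAISI-V2 point lies above it.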
Example Results
Comparison of CT generation from anatomical masks.
- Left: Input organ/body segmentation mask.
- Middle: Generated CT slice using MAISI-V2.
- Right: Generated CT slice using NoMAISI (ours).
- Yellow boxes highlight lung nodule regions for comparison.
Inference Guide
Model Weights
Model weights are available upon request. Please email the authors: [email protected].
Project Structure
NoMAISI/
├── configs/                                  # Configuration files
│   ├── config_maisi3d-rflow.json             # Main model configuration
│   ├── infr_env_NoMAISI_DLCSD24_demo.json    # Environment settings
│   └── infr_config_NoMAISI_controlnet.json   # ControlNet inference config
├── scripts/                                  # Python inference scripts
│   ├── infer_testV2_controlnet.py            # Main inference script
│   ├── infer_controlnet.py                   # ControlNet inference
│   └── utils.py                              # Utility functions
├── models/                                   # Pre-trained model weights
├── data/                                     # Input data directory
├── outputs/                                  # Generated results
├── logs/                                     # Execution logs
└── inference.sub                             # SLURM job script
Configuration Files
1. Main Model Configuration (config_maisi3d-rflow.json): controls the core diffusion model parameters
- Model architecture settings; Sampling parameters; Image dimensions and spacing
2. Environment Configuration (infr_env_NoMAISI_DLCSD24_demo.json): defines the runtime environment
- Data paths and directories; GPU settings; Memory allocation
3. ControlNet Configuration (infr_config_NoMAISI_controlnet.json): ControlNet-specific settings
- Conditioning parameters; Generation controls; Output specifications
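Before launching a run, it can help to confirm that the three files exist and parse as JSON. The helper below is a hypothetical convenience, not part of the repository; it makes no assumption about the keys inside each file and only reports the top-level keys it finds.

```python
import json
from pathlib import Path

# File names taken from the list above.
CONFIG_FILES = (
    "config_maisi3d-rflow.json",
    "infr_env_NoMAISI_DLCSD24_demo.json",
    "infr_config_NoMAISI_controlnet.json",
)

def load_configs(config_dir):
    """Load each expected JSON config and return {filename: parsed content}."""
    config_dir = Path(config_dir)
    loaded = {}
    for name in CONFIG_FILES:
        path = config_dir / name
        if not path.is_file():
            raise FileNotFoundError(f"missing config file: {path}")
        with path.open("r", encoding="utf-8") as handle:
            loaded[name] = json.load(handle)
    return loaded

def summarize(configs):
    """Return the sorted top-level keys of each config (empty list for non-dicts)."""
    return {
        name: sorted(content) if isinstance(content, dict) else []
        for name, content in configs.items()
    }
```

A typical call from the repository root would be `summarize(load_configs("./configs"))`.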
Running Inference
cd /path/NoMAISI/
# Create logs directory if it doesn't exist
mkdir -p logs
# Submit job to SLURM
sbatch inference.sub
# Run inference directly
cd /path/NoMAISI/
python -m scripts.infer_testV2_controlnet \
-c ./configs/config_maisi3d-rflow.json \
-e ./configs/infr_env_NoMAISI_DLCSD24_demo.json \
-t ./configs/infr_config_NoMAISI_controlnet.json
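When the direct run needs to be launched from another Python program, the same command can be assembled as an argument list. This is only a sketch: `build_inference_command` and `run_inference` are hypothetical helpers, and only the module name and the `-c`, `-e`, and `-t` flags come from the command above.

```python
import subprocess
import sys
from pathlib import Path

def build_inference_command(config_dir="./configs", python=sys.executable):
    """Return the argv list equivalent to the direct-run shell command."""
    config_dir = Path(config_dir)
    return [
        python, "-m", "scripts.infer_testV2_controlnet",
        "-c", str(config_dir / "config_maisi3d-rflow.json"),
        "-e", str(config_dir / "infr_env_NoMAISI_DLCSD24_demo.json"),
        "-t", str(config_dir / "infr_config_NoMAISI_controlnet.json"),
    ]

def run_inference(repo_root, config_dir="./configs"):
    """Run inference from the repository root; raises if the process exits non-zero."""
    cmd = build_inference_command(config_dir)
    return subprocess.run(cmd, cwd=repo_root, check=True)
```

Passing the arguments as a list avoids shell quoting issues, and `cwd=repo_root` plays the role of the `cd` step.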
Downstream Task
Downstream Task: Cancer vs. No-Cancer Classification
Shown: AUC versus the percentage of clinical data retained (x-axis: 100%, 50%, 20%, 10%). Augmentation is additive: we add AI-generated nodules and never replace clinical samples. Curves:
- Clinical (LUNA25): baseline using only the retained clinical data.
- Clinical + AI-gen. (n%): at each point, add AI-generated data equal to the same percentage as the retained clinical fraction.
  Examples: at 50% clinical → +50% AI-gen; 20% → +20%; 10% → +10%.
- Clinical + AI-gen. (100%): at each point, add AI-generated data equal to 100% of the full clinical dataset size, regardless of the retained fraction.
  Example: at 10% clinical → +100% AI-gen.
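The two augmentation schemes differ only in how much synthetic data is added at each retention level. The sketch below spells out the arithmetic; it assumes all percentages are taken relative to the full clinical dataset size, and the dataset size of 1000 is a made-up illustration, not a number from the experiments.

```python
def augmentation_sizes(full_clinical_size, retained_fraction):
    """Return sample counts for the three curves at one retention level."""
    retained = round(full_clinical_size * retained_fraction)
    return {
        # Clinical (baseline): only the retained clinical samples.
        "clinical_only": retained,
        # Clinical + AI-gen. (n%): synthetic count matches the retained fraction.
        "ai_gen_matched": retained,
        # Clinical + AI-gen. (100%): synthetic count equals the full clinical size.
        "ai_gen_full": full_clinical_size,
    }

FULL_SIZE = 1000  # hypothetical dataset size, for illustration only
for fraction in (1.0, 0.5, 0.2, 0.1):
    sizes = augmentation_sizes(FULL_SIZE, fraction)
    print(f"{fraction:.0%} clinical: {sizes}")
```

At 10% retention, the matched scheme trains on equal parts clinical and synthetic data, whereas the 100% scheme trains on ten synthetic samples per clinical sample.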
Takeaways
- AI-generated nodules improve data efficiency: at low clinical fractions (50% to 10%), Clinical + AI-gen. (n%) typically matches or exceeds clinical-only AUC.
- Larger synthetic boosts (100%) can help in some regimes but may underperform the matched n% mix depending on the cohort; ratio-balanced augmentation is often safer.
- Trends generalize to external cohorts, indicating usability beyond the development data.
Acknowledgements
We gratefully acknowledge the open-source projects that directly informed this repository: the MAISI tutorial from the Project MONAI tutorials, the broader Project MONAI ecosystem, our related benchmark repo AI in Lung Health β Benchmarking, and our companion toolkits PiNS β Point-driven Nodule Segmentation and CaNA β Context-Aware Nodule Augmentation. We thank these communities and contributors for their exceptional open-source efforts. If you use our models or code, please also consider citing these works (alongside this repository) to acknowledge their contributions.
Base model: MONAI/maisi_ct_generative (model tree for ft42/NoMAISI)