NoMAISI: Nodule-Oriented Medical AI for Synthetic Imaging

PiNS Logo

Nodule-Oriented Medical AI for Synthetic Imaging and Augmentation in Chest CT

License: CC BY-NC 4.0 Docker Python Medical Imaging PyTorch MONAI PiNS CaNA

Abstract

Medical imaging datasets are increasingly available, yet abnormal and annotation-intensive cases such as lung nodules remain underrepresented. We introduce NoMAISI (Nodule-Oriented Medical AI for Synthetic Imaging), a generative framework built on foundational backbones with flow-based diffusion and ControlNet conditioning. Using NoMAISI, we curated a large multi-cohort lung nodule dataset and applied context-aware nodule volume augmentation, including relocation, shrinkage to simulate early-stage disease, and expansion to model progression. Each case was rendered into multiple synthetic variants, producing a diverse and anatomically consistent dataset. Fidelity was evaluated with cross-cohort similarity metrics, and downstream integration into lung nodule detection and classification tasks demonstrated improved external test performance, particularly in underrepresented lesion categories. These results show that nodule-oriented synthetic imaging and curated augmentation can complement clinical data, reduce annotation demands, and expand the availability of training resources for healthcare AI.

🧩 Workflow Overview

The overall pipeline for organ, body, and nodule segmentation with alignment is shown below:

Segmentation Pipeline

Workflow for constructing the NoMAISI development dataset. The pipeline includes (1) organ segmentation using AI models, (2) body segmentation with algorithmic methods, (3) nodule segmentation through AI-assisted and ML-based refinement, and (4) segmentation alignment to integrate organ, body, and nodule segmentations into anatomically consistent volumes.
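
The alignment step can be pictured as stacking the three mask sources into one labeled volume with a fixed precedence (nodules over organs over body). The sketch below is illustrative only; the label IDs and array layout are assumptions, not the repository's actual label scheme.

import numpy as np

# Hypothetical label IDs; the actual NoMAISI label scheme may differ.
BODY_LABEL = 1
ORGAN_LABEL_OFFSET = 10
NODULE_LABEL = 200

def align_segmentations(body_mask, organ_masks, nodule_mask):
    """Merge body, organ, and nodule masks into one labeled volume.

    body_mask:   (D, H, W) boolean array covering the patient body.
    organ_masks: dict {organ_id: (D, H, W) boolean array}.
    nodule_mask: (D, H, W) boolean array of lung nodules.
    Later assignments overwrite earlier ones, so nodules take
    precedence over organs, and organs over the body label.
    """
    combined = np.zeros(body_mask.shape, dtype=np.uint16)
    combined[body_mask] = BODY_LABEL
    for organ_id, mask in organ_masks.items():
        combined[mask] = ORGAN_LABEL_OFFSET + organ_id
    combined[nodule_mask] = NODULE_LABEL
    return combined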

NoMAISI_train_and_infer

Overview of our flow-based latent diffusion model with ControlNet conditioning for AI-based CT generation. The pipeline consists of three stages: (top) Pretrained VAE for image compression, where CT images are encoded into latent features using a frozen VAE; (middle) Model fine-tuning, where a Rectified Flow ODE sampler, conditioned on segmentation masks and voxel spacing through a fine-tuned ControlNet, predicts velocity fields in latent space and is optimized with a region-specific contrastive loss emphasizing ROI sensitivity and background consistency; and (bottom) Inference, where segmentation masks and voxel spacing guide latent sampling along the ODE trajectory to obtain a clean latent representation, which is then decoded by the VAE into full-resolution AI-generated CT images conditioned by body and lesion masks.
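
For intuition, the inference stage amounts to integrating a learned velocity field along a straight-path ODE from noise to a clean latent and then decoding with the frozen VAE. The following is a minimal sketch under that assumption; velocity_net, vae_decoder, and cond are placeholders, and the actual sampler lives in scripts/infer_testV2_controlnet.py.

import torch

@torch.no_grad()
def sample_ct(velocity_net, vae_decoder, cond, latent_shape, num_steps=30, device="cuda"):
    """Minimal rectified-flow sampler sketch (Euler integration of the ODE).

    velocity_net(z, t, cond) -> predicted velocity in latent space,
    conditioned on segmentation masks and voxel spacing (via ControlNet).
    vae_decoder(z)           -> full-resolution CT volume.
    Both interfaces are assumed for illustration.
    """
    z = torch.randn(latent_shape, device=device)              # start from noise at t = 0
    ts = torch.linspace(0.0, 1.0, num_steps + 1, device=device)
    for i in range(num_steps):
        t, dt = ts[i], ts[i + 1] - ts[i]
        v = velocity_net(z, t.expand(latent_shape[0]), cond)  # velocity at the current point
        z = z + dt * v                                         # Euler step along the ODE trajectory
    return vae_decoder(z)                                      # decode the clean latent into CT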

πŸ“Š Dataset Composition

The table below summarizes the datasets included in this project, with their split sizes (Patients, CT scans, and Nodules) and the annotation types available.

| Dataset | Patients, n (%) | CT Scans, n (%) | Nodules, n (%) | Organ Seg | Nodule Seg | Nodule CCC | Nodule Box |
|---|---|---|---|---|---|---|---|
| LNDbv4 | 223 (3.17) | 223 (2.52) | 1132 (7.84) | βœ— | βœ“ | βœ— | βœ“ |
| NSCLC-R | 415 (5.89) | 415 (4.69) | 415 (2.87) | βœ— | βœ“ | βœ— | βœ“ |
| LIDC-IDRI | 870 (12.35) | 870 (9.84) | 2584 (17.89) | βœ— | βœ“ | βœ“ | βœ“ |
| DLCS-24 | 1605 (22.79) | 1605 (18.15) | 2478 (17.16) | βœ— | βœ“ | βœ— | βœ“ |
| Intgmultiomics | 1936 (27.49) | 1936 (21.90) | 1936 (13.40) | βœ— | βœ“ | βœ— | βœ— |
| LUNA-25 | 1993 (28.30) | 3792 (42.89) | 5899 (40.84) | βœ— | βœ“ | βœ— | βœ“ |
| TOTAL | 7042 (100) | 8841 (100) | 14444 (100) | β€” | β€” | β€” | β€” |

Notes

  • Percentages indicate proportion relative to the total for each column.
  • βœ”οΈŽ = annotation available, βœ— = annotation not available.
  • β€œNodule CCC” = nodule center coordinates.
  • β€œNodule Box” = bounding-box annotations.

πŸ“š Dataset Citations and References

AI-Generated CT Evaluations

πŸ“‰ FrΓ©chet Inception Distance (FID) Results

FrΓ©chet Inception Distance (FID) of the MAISI-V2 baseline and NoMAISI models, computed against multiple public clinical datasets (test splits) as references (lower is better).

| FID (Avg.) | LNDbv4 | NSCLC-R | LIDC-IDRI | DLCS-24 | Intgmultiomics | LUNA-25 |
|---|---|---|---|---|---|---|
| Real LNDbv4 | β€” | 5.13 | 1.49 | 1.05 | 2.40 | 1.98 |
| Real NSCLC-R | 5.13 | β€” | 3.12 | 3.66 | 1.56 | 2.65 |
| Real LIDC-IDRI | 1.49 | 3.12 | β€” | 0.79 | 1.44 | 0.75 |
| Real DLCS-24 | 1.05 | 3.66 | 0.79 | β€” | 1.56 | 1.00 |
| Real Intgmultiomics | 2.40 | 1.56 | 1.44 | 1.56 | β€” | 1.57 |
| Real LUNA-25 | 1.98 | 2.65 | 0.75 | 1.00 | 1.57 | β€” |
| AI-Generated MAISI-V2 | 3.15 | 5.21 | 2.70 | 2.32 | 2.82 | 1.69 |
| AI-Generated NoMAISI (ours) | 2.99 | 3.05 | 2.31 | 2.27 | 2.62 | 1.18 |
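
For reference, FID fits a Gaussian to each feature set and reports the FrΓ©chet distance between the two fits. Below is a minimal sketch, assuming feature arrays have already been extracted from real and generated CT slices; the specific feature extractor behind the numbers above is not shown here.

import numpy as np
from scipy import linalg

def fid_from_features(feats_a, feats_b):
    """FID between two (N, D) arrays of image features."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_a @ cov_b, disp=False)  # matrix square root of the covariance product
    if np.iscomplexobj(covmean):                          # drop tiny imaginary parts from numerical noise
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean))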

πŸ“‰ FID Parity Plot

Parity comparison of FID for real↔real vs AI-generated CT across datasets

Comparison of FrΓ©chet Inception Distance (FID) between real↔real and AI-generated↔real CT datasets. Each point represents a clinical dataset (LNDbv4, NSCLC-R, LIDC-IDRI, DLCS-24, Intgmultiomics, LUNA-25) under a generative model (MAISI-V2 or NoMAISI). The x-axis shows the median FID computed between real datasets, while the y-axis shows the FID of AI-generated data compared to real data.
The dashed diagonal line denotes parity (y = x), where AI-generated fidelity would match real↔real fidelity.
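
The parity plot can be reproduced from the FID table above with a few lines of matplotlib; this is a sketch, with the dictionary inputs standing in for the per-dataset FID values.

import matplotlib.pyplot as plt

def parity_plot(real_vs_real, gen_vs_real, label):
    """One point per reference dataset: x = median real-vs-real FID, y = AI-vs-real FID."""
    datasets = sorted(real_vs_real)
    x = [real_vs_real[d] for d in datasets]
    y = [gen_vs_real[d] for d in datasets]
    fig, ax = plt.subplots()
    ax.scatter(x, y, label=label)
    lim = 1.1 * max(max(x), max(y))
    ax.plot([0, lim], [0, lim], "k--", label="parity (y = x)")  # equal-fidelity line
    ax.set_xlabel("median FID, real vs. real")
    ax.set_ylabel("FID, AI-generated vs. real")
    ax.legend()
    return ax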

πŸ–ΌοΈ Example Results

Comparison of CT generation from anatomical masks.

  • Left: Input organ/body segmentation mask.
  • Middle: Generated CT slice using MAISI-V2.
  • Right: Generated CT slice using NoMAISI (ours).
  • Yellow boxes highlight lung nodule regions for comparison.

Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks

Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks

Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks

Inference Guide

  1. Project Structure
  2. Configuration Files

Model Weights

Model weights are available upon request. Please email the authors: [email protected].

πŸ“ Project Structure

NoMAISI/
β”œβ”€β”€ configs/                          # Configuration files
β”‚   β”œβ”€β”€ config_maisi3d-rflow.json    # Main model configuration
β”‚   β”œβ”€β”€ infr_env_NoMAISI_DLCSD24_demo.json  # Environment settings
β”‚   └── infr_config_NoMAISI_controlnet.json # ControlNet inference config
β”œβ”€β”€ scripts/                          # Python inference scripts
β”‚   β”œβ”€β”€ infer_testV2_controlnet.py   # Main inference script
β”‚   β”œβ”€β”€ infer_controlnet.py          # ControlNet inference
β”‚   └── utils.py                     # Utility functions
β”œβ”€β”€ models/                           # Pre-trained model weights
β”œβ”€β”€ data/                            # Input data directory
β”œβ”€β”€ outputs/                         # Generated results
β”œβ”€β”€ logs/                           # Execution logs
└── inference.sub                   # SLURM job script

βš™οΈ Configuration Files

1. Main Model Configuration (config_maisi3d-rflow.json): controls the core diffusion model parameters

  β€’ Model architecture settings
  β€’ Sampling parameters
  β€’ Image dimensions and spacing

2. Environment Configuration (infr_env_NoMAISI_DLCSD24_demo.json): defines the runtime environment

  β€’ Data paths and directories
  β€’ GPU settings
  β€’ Memory allocation

3. ControlNet Configuration (infr_config_NoMAISI_controlnet.json): ControlNet-specific settings

  β€’ Conditioning parameters
  β€’ Generation controls
  β€’ Output specifications

πŸš€ Running Inference

cd /path/NoMAISI/
# Create logs directory if it doesn't exist
mkdir -p logs

# Option 1: submit the job to SLURM
sbatch inference.sub

# Option 2: run inference directly
python -m scripts.infer_testV2_controlnet \
    -c ./configs/config_maisi3d-rflow.json \
    -e ./configs/infr_env_NoMAISI_DLCSD24_demo.json \
    -t ./configs/infr_config_NoMAISI_controlnet.json

Downstream Task:

  • Cancer vs. No-Cancer Classification
  • Nodule Detection Coming Soon.
  • Nodule Segmentation Coming Soon.


πŸ”¬ Downstream Task: Cancer vs. No-Cancer Classification

Cancer/No-Cancer Classification Results

Shown: AUC vs. the percentage of clinical data retained (x-axis: 100%, 50%, 20%, 10%). All curves use additive augmentation: we add AI-generated nodules and never replace clinical samples (see the mixing sketch after this list).

  • Clinical (LUNA25) β€” baseline using only the retained clinical data.
  • Clinical + AI-gen. (n%) β€” at each point, add AI-generated data equal to the same percentage as the retained clinical fraction.
    Examples: at 50% clinical β†’ +50% AI-gen; 20% β†’ +20%; 10% β†’ +10%.
  • Clinical + AI-gen. (100%) β€” at each point, add AI-generated data equal to 100% of the full clinical dataset size, regardless of the retained fraction.
    Example: at 10% clinical β†’ +100% AI-gen.
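
A minimal sketch of this mixing protocol, with clinical and ai_generated as placeholder sample lists:

import random

def build_training_mix(clinical, ai_generated, retain_frac, ai_frac, seed=0):
    """Additive augmentation: keep retain_frac of the clinical data and add
    AI-generated samples equal to ai_frac of the FULL clinical dataset size.
    Clinical samples are never replaced, only supplemented.
    """
    rng = random.Random(seed)
    kept = rng.sample(clinical, int(round(retain_frac * len(clinical))))
    n_ai = min(int(round(ai_frac * len(clinical))), len(ai_generated))
    return kept + rng.sample(ai_generated, n_ai)

# 10% clinical + matched 10% AI-gen   vs.   10% clinical + 100% AI-gen
# mix_matched = build_training_mix(clinical, ai_generated, 0.10, 0.10)
# mix_full    = build_training_mix(clinical, ai_generated, 0.10, 1.00)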

Takeaways

  • AI-generated nodules improve data-efficiency: at low clinical fractions (50%β†’10%), Clinical + AI-gen. (n%) typically matches or exceeds clinical-only AUC.
  • Bigger synthetic boosts (100%) can help in some regimes but may underperform the matched n% mix depending on cohort β†’ ratio-balanced augmentation is often safer.
  • Trends generalize to external cohorts, indicating usability beyond the development data.

Acknowledgements

We gratefully acknowledge the open-source projects that directly informed this repository: the MAISI tutorial from the Project MONAI tutorials, the broader Project MONAI ecosystem, our related benchmark repo AI in Lung Health – Benchmarking, and our companion toolkits PiNS – Point-driven Nodule Segmentation and CaNA – Context-Aware Nodule Augmentation. We thank these communities and contributors for their exceptional open-source efforts. If you use our models or code, please also consider citing these works (alongside this repository) to acknowledge their contributions.
