The content of this repository represents work done by Przemysław Mirowski for the Master's thesis "Generation of brain scan images from segmentation maps using diffusion models" and the scientific article "Diffusion model-based synthesis of brain images for data augmentation", both carried out at Lodz University of Technology, Poland.

This work is licensed under Creative Commons Attribution-NonCommercial 4.0 International.
```bibtex
@article{MIROWSKI2026108940,
  title = {Diffusion model-based synthesis of brain images for data augmentation},
  journal = {Biomedical Signal Processing and Control},
  volume = {113},
  pages = {108940},
  year = {2026},
  issn = {1746-8094},
  doi = {https://doi.org/10.1016/j.bspc.2025.108940},
  url = {https://www.sciencedirect.com/science/article/pii/S174680942501451X},
  author = {Przemysław Mirowski and Anna Fabijańska},
  keywords = {Brain lesion segmentation, ControlNet, Diffusion model, Image augmentation, U-Net, SPADE, Pix2Pix},
}

@mastersthesis{mirowski2024,
  author = {Przemysław Mirowski},
  title = {Generation of brain scan images from segmentation maps using diffusion models},
  school = {Lodz University of Technology},
  year = {2024}
}
```
All content of the repository was tested on Windows 11 23H2 with Docker Desktop 4.37.1 and NVIDIA Studio Driver 566.36. The computer configuration is listed in the table below:

| Graphics card | RAM | CPU |
|---|---|---|
| NVIDIA GeForce RTX 3080 12GB | 64 GB | AMD Ryzen 7 5800X |
The work consists of three parts, described below:
- Data preparation - prepares the data for generative model training and creates the sets of IDs used for generative and segmentation model training,
- Generative models - covers training of the proposed, ControlNet, SPADE, and Pix2Pix models, data generation for evaluation and for the segmentation model, and evaluation of the generative models,
- Segmentation model - covers segmentation model training.

The sections are independent of each other, so the commands of each section should be executed starting from the repository root directory.
To run the data preparation scripts, execute the commands below:
- Move to the dataset directory
  ```shell
  cd ./dataset
  ```
- Run the PowerShell script (builds and runs the Docker container), where you first need to create the `data` directory. Under the `data` directory, create a `raw` directory with the `BraTS2021_Training_Data.tar` file downloaded from the BraTS2021 website and unpacked.
  ```powershell
  ./run.ps1 -dataPath "C:\Users\$env:USERNAME\Desktop\data"
  ```
- After data preparation finishes, a couple of new directories are created:
  - `/data/raw/extracted` - raw data extracted from `nii.gz` files to PNG
  - `/data/metadata/dataset` - information about the generated data
  - `/data/ids/raw` - files describing which patient belongs to which set: train, validation, or test
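The directory layout expected before running the container can be created with a short shell sketch. The paths mirror the ones described above; `DATA_ROOT` is whatever you pass as `-dataPath`, and the BraTS2021 archive itself must still be downloaded manually:

```shell
# Sketch: prepare the layout expected by run.ps1.
# A throwaway path is used here; in practice point DATA_ROOT at your -dataPath.
DATA_ROOT="${DATA_ROOT:-$(mktemp -d)/data}"
mkdir -p "$DATA_ROOT/raw"
# Place the manually downloaded archive under raw/ and unpack it, e.g.:
# tar -xf "$DATA_ROOT/raw/BraTS2021_Training_Data.tar" -C "$DATA_ROOT/raw"
echo "$DATA_ROOT"
```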
To train the proposed model, execute the commands below:
- Move to the custom model directory
  ```shell
  cd ./generative/custom
  ```
- Run the PowerShell script (builds and runs the Docker container), where you first need to create the `generation/custom` directory under `models`.
  ```powershell
  .\run.ps1 `
    -dataPath "C:\Users\$env:USERNAME\Desktop\data" `
    -modelPath "C:\Users\$env:USERNAME\Desktop\models\generation\custom"
  ```
- Model training (run the script inside the Docker container)
  ```shell
  ./src/bash/training/01_training.sh
  ```
Once the final model is ready, you can proceed to evaluation and data generation for the segmentation model (all commands should be executed in the previously created Docker container):
- Data generation for reconstruction analysis. Before running the script, you need to provide a proper `--run_id` value (for the latest run, it is the newest directory name under `/models/generation/custom/runs` in the Docker container, or `C:\Users\$env:USERNAME\Desktop\models\generation\custom\runs` locally).
  ```shell
  ./src/bash/generation/test/01_reconstruction.sh
  ```
- Data generation for diversity analysis. Before running the script, you need to provide a proper `--run_id` value (as for reconstruction analysis). By default, the diversity test generates 1000 images; to change that number, modify the `--img_to_gen_per_seg_map` parameter inside the script.
  ```shell
  ./src/bash/generation/test/02_diversity.sh
  ```
- Data generation for the segmentation model
  ```shell
  ./src/bash/generation/seg/01_whole_train_set.sh
  ```
- Copy segmentation maps for the segmentation model
  ```shell
  ./src/bash/generation/seg/02_copy_seg_masks.sh
  ```
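The `--run_id` lookup described above (newest directory under `runs`) can be scripted. The demo below runs on a throwaway directory with hypothetical run names; in the container you would point `runs_dir` at `/models/generation/custom/runs`:

```shell
# Demo: pick the most recently modified run directory (ls -t sorts newest first).
runs_dir=$(mktemp -d)                       # stand-in for /models/generation/custom/runs
mkdir "$runs_dir/run_old" "$runs_dir/run_new"
touch -t 202001010000 "$runs_dir/run_old"   # make run_old look older
run_id=$(ls -t "$runs_dir" | head -n 1)
echo "$run_id"                              # prints: run_new
```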
To train the ControlNet model, execute the commands below:
- Move to the ControlNet model directory
  ```shell
  cd ./generative/generative_brain_controlnet
  ```
- Run the PowerShell script (builds and runs the Docker container), where you first need to create the `generation/controlnet` directory under `models`, and under the newly created `controlnet` directory also the `artifacts`, `runs`, and `results` directories. The command below assumes the content of this repository is cloned under the `C:\Users\$env:USERNAME\Desktop` path.
  ```powershell
  .\run.ps1 `
    -dataPath "C:\Users\$env:USERNAME\Desktop\data" `
    -configPath "C:\Users\$env:USERNAME\Desktop\synthetic-brain-mri-project\generative\generative_brain_controlnet\configs" `
    -artifactPath "C:\Users\$env:USERNAME\Desktop\models\generation\controlnet\artifacts" `
    -modelPath "C:\Users\$env:USERNAME\Desktop\models\generation\controlnet\runs" `
    -resultPath "C:\Users\$env:USERNAME\Desktop\models\generation\controlnet\results"
  ```
- Model training - autoencoder
  ```shell
  ./src/bash/training/01_train_aekl.sh
  ```
- Model training - diffusion model, where inside the script you need to update the `mlrun_id` parameter with the run_id printed to the console during autoencoder training.
  ```shell
  ./src/bash/training/02_train_ldm.sh
  ```
- Model training - ControlNet, where inside the script you need to update the `stage1_mlrun_id` (autoencoder) and `ldm_mlrun_id` (diffusion model) parameters with the run_id values printed during autoencoder and diffusion model training.
  ```shell
  ./src/bash/training/03_train_controlnet.sh
  ```
Once the final model is ready, you can proceed to evaluation and data generation for the segmentation model (all commands should be executed in the previously created Docker container):
- Conversion of MLflow models to PyTorch, where inside the script you need to update the `stage1_mlrun_id` (autoencoder), `ldm_mlrun_id` (diffusion model), and `controlnet_mlrun_id` (ControlNet) parameters with the run_id values printed during training of the autoencoder, diffusion, and ControlNet models.
  ```shell
  ./src/bash/training/04_convert_mlflow_to_pytorch.sh
  ```
- Data generation for reconstruction analysis
  ```shell
  ./src/bash/generation/test/01_reconstruction.sh
  ```
- Data generation for diversity analysis
  ```shell
  ./src/bash/generation/test/02_diversity.sh
  ```
- Data generation for the segmentation model
  ```shell
  ./src/bash/generation/seg/01_whole_train_set.sh
  ```
- Copy segmentation maps for the segmentation model
  ```shell
  ./src/bash/generation/seg/02_copy_seg_masks.sh
  ```
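The ControlNet directory structure required before running `run.ps1` (the `controlnet` directory with `artifacts`, `runs`, and `results` underneath) can be created in one go. `MODELS_ROOT` is a stand-in for your local `models` directory:

```shell
# Sketch: create the ControlNet directories expected by run.ps1.
# A throwaway path is used here; in practice point MODELS_ROOT at your models directory.
MODELS_ROOT="${MODELS_ROOT:-$(mktemp -d)/models}"
mkdir -p "$MODELS_ROOT/generation/controlnet/artifacts" \
         "$MODELS_ROOT/generation/controlnet/runs" \
         "$MODELS_ROOT/generation/controlnet/results"
ls "$MODELS_ROOT/generation/controlnet"
```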
To train the SPADE model, execute the commands below:
- Move to the SPADE model directory
  ```shell
  cd ./generative/spade
  ```
- Run the PowerShell script (builds and runs the Docker container), where you first need to create the `generation/spade` directory under `models`.
  ```powershell
  .\run.ps1 `
    -dataPath "C:\Users\$env:USERNAME\Desktop\data" `
    -modelPath "C:\Users\$env:USERNAME\Desktop\models\generation\spade"
  ```
- Model training (run the script inside the Docker container)
  ```shell
  ./src/bash/training/01_training.sh
  ```
Once the final model is ready, you can proceed to evaluation and data generation for the segmentation model (all commands should be executed in the previously created Docker container):
- Data generation for reconstruction analysis. Before running the script, you need to provide proper `--name` and `--which_epoch` values (for the latest run, `--name` is the newest directory name under `/models/generation/spade/runs` and `--which_epoch` the newest entry under `/models/generation/spade/runs/<name>/epochs` in the Docker container, or under `C:\Users\$env:USERNAME\Desktop\models\generation\spade\runs` locally).
  ```shell
  ./src/bash/generation/test/01_reconstruction.sh
  ```
- Data generation for diversity analysis. Before running the script, provide proper `--name` and `--which_epoch` values (as for reconstruction analysis).
  ```shell
  ./src/bash/generation/test/02_diversity.sh
  ```
- Data generation for the segmentation model
  ```shell
  ./src/bash/generation/seg/01_whole_train_set.sh
  ```
- Copy segmentation maps for the segmentation model
  ```shell
  ./src/bash/generation/seg/02_copy_seg_masks.sh
  ```
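The two-level lookup for `--name` and `--which_epoch` described above (newest run, then newest epoch inside it) can be sketched the same way. The demo uses a throwaway directory with hypothetical names; in the container you would point `runs_dir` at `/models/generation/spade/runs`:

```shell
# Demo: newest run name, then newest epoch within that run.
runs_dir=$(mktemp -d)                    # stand-in for /models/generation/spade/runs
mkdir -p "$runs_dir/exp1/epochs/10" "$runs_dir/exp1/epochs/20"
touch -t 202001010000 "$runs_dir/exp1/epochs/10"   # make epoch 10 look older
name=$(ls -t "$runs_dir" | head -n 1)
which_epoch=$(ls -t "$runs_dir/$name/epochs" | head -n 1)
echo "$name $which_epoch"                # prints: exp1 20
```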
To train the Pix2Pix model, execute the commands below:
- Move to the Pix2Pix model directory
  ```shell
  cd ./generative/pix2pix
  ```
- Run the PowerShell script (builds and runs the Docker container), where you first need to create the `generation/pix2pix` directory under `models`.
  ```powershell
  .\run.ps1 `
    -dataPath "C:\Users\$env:USERNAME\Desktop\data" `
    -modelPath "C:\Users\$env:USERNAME\Desktop\models\generation\pix2pix"
  ```
- Model training (run the script inside the Docker container)
  ```shell
  ./src/bash/training/01_training.sh
  ```
Once the final model is ready, you can proceed to evaluation and data generation for the segmentation model (all commands should be executed in the previously created Docker container):
- Data generation for reconstruction analysis. Before running the script, you need to provide proper `--name` and `--epoch` values (for the latest run, `--name` is the newest directory name under `/models/generation/pix2pix/runs` and `--epoch` the newest entry under `/models/generation/pix2pix/runs/<name>/epochs` in the Docker container, or under `C:\Users\$env:USERNAME\Desktop\models\generation\pix2pix\runs` locally).
  ```shell
  ./src/bash/generation/test/01_reconstruction.sh
  ```
- Data generation for diversity analysis. Before running the script, provide proper `--name` and `--epoch` values (as for reconstruction analysis).
  ```shell
  ./src/bash/generation/test/02_diversity.sh
  ```
- Data generation for the segmentation model
  ```shell
  ./src/bash/generation/seg/01_whole_train_set.sh
  ```
- Copy segmentation maps for the segmentation model
  ```shell
  ./src/bash/generation/seg/02_copy_seg_masks.sh
  ```
To run the evaluation of the proposed, ControlNet, SPADE, and Pix2Pix models (calculation of FID and MS-SSIM scores), execute the commands below:
- Move to the testing directory
  ```shell
  cd ./generative/testing
  ```
- Run the PowerShell script (builds and runs the Docker container)
  ```powershell
  ./run.ps1 `
    -dataPath "C:\Users\$env:USERNAME\Desktop\data" `
    -modelsGenPath "C:\Users\$env:USERNAME\Desktop\models\generation"
  ```
- Generate MS-SSIM scores (reconstruction)
  ```shell
  ./src/bash/testing/01_reconstruction_ms-ssim.sh
  ```
- Generate FID scores (reconstruction)
  ```shell
  ./src/bash/testing/02_reconstruction_fid.sh
  ```
- Generate MS-SSIM scores (diversity) for the proposed model
  ```shell
  ./src/bash/testing/03_diversity_ms-ssim.sh
  ```

After the evaluation finishes, a `metrics.json` file is created under the `/data/metadata/generation` directory (from the container perspective). It contains all results from the evaluation of all generative models.
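A quick way to inspect the resulting `metrics.json` is to pretty-print it. The snippet below demonstrates this on a tiny stand-in file with a hypothetical schema (the real file lives at `/data/metadata/generation/metrics.json` inside the container, and its exact structure is not shown here):

```shell
# Demo on a stand-in file; replace the path with
# /data/metadata/generation/metrics.json inside the container.
metrics=$(mktemp)
echo '{"custom": {"fid": 12.3}}' > "$metrics"   # hypothetical schema
python3 -m json.tool "$metrics"
```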
To train the segmentation model, execute the commands below:
- Move to the segmentation directory
  ```shell
  cd ./segmentation
  ```
- Run the PowerShell script (builds and runs the Docker container)
  ```powershell
  .\run.ps1 `
    -dataPath "C:\Users\$env:USERNAME\Desktop\data" `
    -modelsPath "C:\Users\$env:USERNAME\Desktop\models\segmentation\artifacts" `
    -resultsPath "C:\Users\$env:USERNAME\Desktop\models\segmentation\results"
  ```
- Start training of the segmentation model
  ```shell
  ./bash/01_training.sh
  ```
To evaluate the segmentation models, execute the commands below:
- Move to the segmentation directory
  ```shell
  cd ./segmentation
  ```
- Run the PowerShell script (builds and runs the Docker container)
  ```powershell
  .\run.ps1 `
    -dataPath "C:\Users\$env:USERNAME\Desktop\data" `
    -modelsPath "C:\Users\$env:USERNAME\Desktop\models\segmentation\artifacts" `
    -resultsPath "C:\Users\$env:USERNAME\Desktop\models\segmentation\results"
  ```
- Start evaluation of the segmentation model
  ```shell
  ./bash/02_evaluation.sh
  ```