Describe the bug
I just tried to put together some sample code for #6626 but ran into a warning I have seen many times before. The problem appears when a transform pushes data to the GPU and the data is then handed over from the DataLoader worker process to the main process.
This is not a hard bug, but it is very annoying since the warning is spammed a lot.
A temporary workaround I found is to add "persistent_workers=True" to the DataLoader; the warning is then only shown at the end of the program, and sometimes not at all.
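For reference, the workaround only touches the DataLoader construction. Below is a minimal sketch with a hypothetical small CPU-only dataset standing in for the ArrayDataset used in the report, so it runs without a GPU:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical small CPU dataset standing in for the ArrayDataset in the report.
xs = torch.rand(4, 1, 8, 8)
ys = torch.rand(4, 1, 8, 8)
dataset = TensorDataset(xs, ys)

# persistent_workers=True keeps the worker processes alive between epochs
# instead of tearing them down (and any per-worker CUDA state) after each one.
loader = DataLoader(
    dataset,
    batch_size=2,
    num_workers=1,
    persistent_workers=True,
)

# 4 samples at batch_size=2 yield two batches of shape (2, 1, 8, 8).
shapes = [tuple(x.shape) for x, y in loader]
```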
Warning message:
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
To Reproduce
Run this code, minimal sample:
import torch
from torch import optim, nn
from monai.data import DataLoader, ArrayDataset
from monai.engines import SupervisedTrainer
from monai.inferers import SlidingWindowInferer
from monai.networks.nets.dynunet import DynUNet
import monai.transforms as mt

NETWORK_INPUT_SHAPE = (1, 128, 128, 256)
NUM_IMAGES = 50

def get_xy():
    xs = [256 * torch.rand(NETWORK_INPUT_SHAPE) for _ in range(NUM_IMAGES)]
    ys = [torch.rand(NETWORK_INPUT_SHAPE) for _ in range(NUM_IMAGES)]
    return xs, ys

# The transform moves each sample to the GPU inside the DataLoader worker.
transform = mt.Compose([
    mt.ToDevice(device="cuda"),
])

def get_data_loader():
    x, y = get_xy()
    dataset = ArrayDataset(x, seg=y, img_transform=transform, seg_transform=transform)
    loader = DataLoader(dataset, num_workers=1, batch_size=1, multiprocessing_context="spawn")
    return loader

def get_model():
    return DynUNet(
        spatial_dims=3,
        in_channels=1,
        out_channels=1,
        kernel_size=[3, 3, 3, 3, 3, 3],
        strides=[1, 2, 2, 2, 2, [2, 2, 1]],
        upsample_kernel_size=[2, 2, 2, 2, [2, 2, 1]],
        norm_name="instance",
        deep_supervision=False,
        res_block=True,
    ).to(device=device)

if __name__ == "__main__":
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train_loader = get_data_loader()
    model = get_model()
    MAX_EPOCHS = 2
    optimizer = optim.Adam(model.parameters())
    inferer = SlidingWindowInferer(roi_size=(64, 64, 64), sw_batch_size=10, mode="gaussian")
    trainer = SupervisedTrainer(
        device=device,
        max_epochs=MAX_EPOCHS,
        amp=True,
        train_data_loader=train_loader,
        network=model,
        optimizer=optimizer,
        inferer=inferer,
        loss_function=nn.CrossEntropyLoss(),
        prepare_batch=lambda batchdata, device, non_blocking: (
            batchdata[0].to(device),
            batchdata[1].squeeze(1).to(device, dtype=torch.long),
        ),
    )
    trainer.run()
Expected behavior
No CUDA warnings.
Environment
Verified in several different environments.
================================
Printing MONAI config...
================================
MONAI version: 1.1.0
Numpy version: 1.23.5
Pytorch version: 1.13.1+cu117
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: a2ec3752f54bfc3b40e7952234fbeb5452ed63e3
MONAI __file__: /home/matteo/anaconda3/envs/monai/lib/python3.9/site-packages/monai/__init__.py
Optional dependencies:
Pytorch Ignite version: 0.4.10
Nibabel version: 5.0.1
scikit-image version: 0.20.0
Pillow version: 9.5.0
Tensorboard version: 2.12.1
gdown version: 4.7.1
TorchVision version: 0.14.0+cu117
tqdm version: 4.64.1
lmdb version: 1.4.0
psutil version: 5.9.4
pandas version: 1.5.3
einops version: 0.6.0
transformers version: 4.21.3
mlflow version: 2.2.2
pynrrd version: 1.0.0
For details about installing the optional dependencies, please visit:
https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
================================
Printing system config...
================================
System: Linux
Linux version: Ubuntu 22.04.2 LTS
Platform: Linux-5.19.0-45-generic-x86_64-with-glibc2.35
Processor: x86_64
Machine: x86_64
Python version: 3.9.16
Process name: python
Command: ['python', '-c', 'import monai; monai.config.print_debug_info()']
Open files: []
Num physical CPUs: 12
Num logical CPUs: 24
Num usable CPUs: 24
CPU usage (%): [4.1, 3.6, 4.2, 3.6, 3.6, 3.7, 3.6, 3.1, 3.6, 4.1, 5.2, 99.5, 4.1, 3.6, 4.6, 3.6, 3.6, 3.6, 3.6, 3.6, 3.6, 5.2, 3.6, 4.2]
CPU freq. (MHz): 3687
Load avg. in last 1, 5, 15 mins (%): [6.3, 7.6, 7.5]
Disk usage (%): 24.8
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 31.2
Available memory (GB): 26.9
Used memory (GB): 3.9
================================
Printing GPU config...
================================
Num GPUs: 1
Has CUDA: True
CUDA version: 11.7
cuDNN enabled: True
cuDNN version: 8500
Current device: 0
Library compiled for CUDA architectures: ['sm_37', 'sm_50', 'sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86']
GPU 0 Name: NVIDIA GeForce RTX 3090 Ti
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 84
GPU 0 Total memory (GB): 22.2
GPU 0 CUDA capability (maj.min): 8.6
Additional context
Adding an evaluator makes the warnings worse, and an additional warning now appears:
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
[W CUDAGuardImpl.h:46] Warning: CUDA warning: driver shutting down (function uncheckedGetDevice)
[W CUDAGuardImpl.h:62] Warning: CUDA warning: driver shutting down (function uncheckedSetDevice)
[W CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
The code for that:
import torch
from torch import optim, nn
from monai.data import DataLoader, ArrayDataset
from monai.engines import SupervisedEvaluator, SupervisedTrainer
from monai.handlers import LrScheduleHandler, StatsHandler, ValidationHandler
from monai.inferers import SlidingWindowInferer
from monai.networks.nets.dynunet import DynUNet
import monai.transforms as mt

NETWORK_INPUT_SHAPE = (1, 128, 128, 256)
NUM_IMAGES = 50

def get_xy():
    xs = [256 * torch.rand(NETWORK_INPUT_SHAPE) for _ in range(NUM_IMAGES)]
    ys = [torch.rand(NETWORK_INPUT_SHAPE) for _ in range(NUM_IMAGES)]
    return xs, ys

transform = mt.Compose([
    mt.ToDevice(device="cuda"),
])

def get_data_loader():
    x, y = get_xy()
    dataset = ArrayDataset(x, seg=y, img_transform=transform, seg_transform=transform)
    loader = DataLoader(dataset, num_workers=1, batch_size=1, multiprocessing_context="spawn")
    return loader

def get_model():
    return DynUNet(
        spatial_dims=3,
        # 1 channel for the image; the label carries one signal per voxel at image size
        in_channels=1,
        out_channels=1,
        kernel_size=[3, 3, 3, 3, 3, 3],
        strides=[1, 2, 2, 2, 2, [2, 2, 1]],
        upsample_kernel_size=[2, 2, 2, 2, [2, 2, 1]],
        norm_name="instance",
        deep_supervision=False,
        res_block=True,
    ).to(device=device)

if __name__ == "__main__":
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train_loader = get_data_loader()
    model = get_model()
    MAX_EPOCHS = 2
    optimizer = optim.Adam(model.parameters())
    inferer = SlidingWindowInferer(roi_size=(64, 64, 64), sw_batch_size=10, mode="gaussian")
    val_inferer = SlidingWindowInferer(roi_size=(64, 64, 64), sw_batch_size=10, mode="gaussian")
    val_handlers = [
        StatsHandler(output_transform=lambda x: None),
    ]
    evaluator = SupervisedEvaluator(
        device=device,
        amp=True,
        val_data_loader=train_loader,
        network=model,
        inferer=val_inferer,
        prepare_batch=lambda batchdata, device, non_blocking: (
            batchdata[0].to(device),
            batchdata[1].squeeze(1).to(device, dtype=torch.long),
        ),
        val_handlers=val_handlers,
    )
    lr_scheduler = torch.optim.lr_scheduler.PolynomialLR(optimizer, total_iters=MAX_EPOCHS, power=2)
    train_handlers = [
        ValidationHandler(validator=evaluator, interval=1, epoch_level=True),
        LrScheduleHandler(lr_scheduler=lr_scheduler, print_lr=True),
    ]
    trainer = SupervisedTrainer(
        device=device,
        max_epochs=MAX_EPOCHS,
        amp=True,
        train_data_loader=train_loader,
        network=model,
        optimizer=optimizer,
        inferer=inferer,
        loss_function=nn.CrossEntropyLoss(),
        prepare_batch=lambda batchdata, device, non_blocking: (
            batchdata[0].to(device),
            batchdata[1].squeeze(1).to(device, dtype=torch.long),
        ),
        train_handlers=train_handlers,
    )
    trainer.run()