@@ -426,6 +426,106 @@ the resulting core affinity of the OpenMP threads are:
426426 Hello from rank 7, thread 0, on nid00026. (core affinity = 18)
427427 Hello from rank 7, thread 1, on nid00026. (core affinity = 19)
428428
429+
430+ Slurm with GPUs examples
431+ ^^^^^^^^^^^^^^^^^^^^^^^^
432+
433+ .. note::
434+
435+    New in 0.8.0
436+
437+ The :py:meth:`~ipsframework.services.ServicesProxy.launch_task` method
438+ has an option ``task_gpp`` that sets the number of GPUs per
439+ process. It is passed as ``--gpus-per-task`` in the ``srun``
440+ command.
441+
442+ IPS validates that the number of GPUs requested per node does not
443+ exceed the number specified by the ``GPUS_PER_NODE`` parameter in the
444+ :ref:`plat-conf-sec`. Make sure that the number of GPUs per process
445+ times the number of processes per node does not exceed the
446+ ``GPUS_PER_NODE`` value.
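
The constraint described above can be written out as a small check. This is only an illustrative sketch, not the IPS implementation: the function names are hypothetical, and the value 4 for ``GPUS_PER_NODE`` is assumed to match a Perlmutter GPU node.

```python
def gpus_requested_per_node(procs_per_node: int, gpus_per_proc: int) -> int:
    """Number of GPUs a task needs on each node it occupies."""
    return procs_per_node * gpus_per_proc


def fits_on_node(procs_per_node: int, gpus_per_proc: int, gpus_per_node: int) -> bool:
    """True when the request stays within the node's GPU count."""
    return gpus_requested_per_node(procs_per_node, gpus_per_proc) <= gpus_per_node


# Assumed platform value (hypothetical stand-in for GPUS_PER_NODE)
GPUS_PER_NODE = 4

print(fits_on_node(4, 1, GPUS_PER_NODE))  # 4 processes x 1 GPU each
print(fits_on_node(1, 4, GPUS_PER_NODE))  # 1 process x 4 GPUs
print(fits_on_node(8, 1, GPUS_PER_NODE))  # 8 processes x 1 GPU each
```

The first two requests use exactly 4 GPUs per node and fit; the third would need 8 GPUs on a single node and does not.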
447+
448+ Using the `gpus_for_tasks
449+ <https://docs.nersc.gov/jobs/affinity/#gpus>`_ program provided for
450+ Perlmutter (which has 4 GPUs per node) to test the behavior, you will
451+ see the following:
452+
453+
454+ To launch a task with 1 process and 1 GPU per process (``task_gpp``), run:
455+
456+ .. code-block:: python
457+
458+    self.services.launch_task(1, cwd, "gpu-per-task", task_gpp=1)
459+
460+ This creates the command ``srun -N 1 -n 1 -c
461+ 64 --threads-per-core=1 --cpu-bind=cores --gpus-per-task=1
462+ gpus_for_tasks``, and the output will be:
463+
464+ .. code-block:: text
465+
466+    Rank 0 out of 1 processes: I see 1 GPU(s).
467+    0 for rank 0: 0000:03:00.0
468+
469+ To launch 8 processes on 2 nodes (so 4 processes per node) with 1 GPU per process, run:
470+
471+ .. code-block:: python
472+
473+    self.services.launch_task(8, cwd, "gpu-per-task", task_ppn=4, task_gpp=1)
474+
475+ This creates the command ``srun -N 2 -n 8 -c
476+ 16 --threads-per-core=1 --cpu-bind=cores --gpus-per-task=1
477+ gpus_for_tasks``, and the output will be:
478+
479+ .. code-block:: text
480+
481+    Rank 0 out of 8 processes: I see 1 GPU(s).
482+    0 for rank 0: 0000:03:00.0
483+    Rank 1 out of 8 processes: I see 1 GPU(s).
484+    0 for rank 1: 0000:41:00.0
485+    Rank 2 out of 8 processes: I see 1 GPU(s).
486+    0 for rank 2: 0000:82:00.0
487+    Rank 3 out of 8 processes: I see 1 GPU(s).
488+    0 for rank 3: 0000:C1:00.0
489+    Rank 4 out of 8 processes: I see 1 GPU(s).
490+    0 for rank 4: 0000:03:00.0
491+    Rank 5 out of 8 processes: I see 1 GPU(s).
492+    0 for rank 5: 0000:41:00.0
493+    Rank 6 out of 8 processes: I see 1 GPU(s).
494+    0 for rank 6: 0000:82:00.0
495+    Rank 7 out of 8 processes: I see 1 GPU(s).
496+    0 for rank 7: 0000:C1:00.0
497+
498+ To launch 2 processes on 2 nodes (so 1 process per node) with 4 GPUs per process, run:
499+
500+ .. code-block:: python
501+
502+    self.services.launch_task(2, cwd, "gpu-per-task", task_ppn=1, task_gpp=4)
503+
504+ This creates the command ``srun -N 2 -n 2 -c
505+ 64 --threads-per-core=1 --cpu-bind=cores --gpus-per-task=4
506+ gpus_for_tasks``, and the output will be:
507+
508+ .. code-block:: text
509+
510+    Rank 0 out of 2 processes: I see 4 GPU(s).
511+    0 for rank 0: 0000:03:00.0
512+    1 for rank 0: 0000:41:00.0
513+    2 for rank 0: 0000:82:00.0
514+    3 for rank 0: 0000:C1:00.0
515+    Rank 1 out of 2 processes: I see 4 GPU(s).
516+    0 for rank 1: 0000:03:00.0
517+    1 for rank 1: 0000:41:00.0
518+    2 for rank 1: 0000:82:00.0
519+    3 for rank 1: 0000:C1:00.0
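
The ``srun`` arguments in the three examples above follow a simple pattern, which the sketch below reproduces. It is not the IPS implementation: the function name is hypothetical, the 64 cores per node are inferred from the ``-c 64`` in the single-process example, and the default of placing all processes on one node when ``task_ppn`` is omitted is an assumption made for illustration.

```python
import math

# Assumption: 64 usable cores per node, inferred from "-c 64" above
CORES_PER_NODE = 64


def sketch_srun_command(nproc, binary, task_ppn=None, task_gpp=0,
                        cores_per_node=CORES_PER_NODE):
    """Build an srun command string resembling the ones shown above."""
    # Assumed default: pack all processes onto as few nodes as possible
    ppn = task_ppn if task_ppn else min(nproc, cores_per_node)
    nodes = math.ceil(nproc / ppn)          # value for -N
    cpus_per_task = cores_per_node // ppn   # value for -c
    parts = [
        "srun", "-N", str(nodes), "-n", str(nproc), "-c", str(cpus_per_task),
        "--threads-per-core=1", "--cpu-bind=cores",
    ]
    if task_gpp:
        parts.append(f"--gpus-per-task={task_gpp}")
    parts.append(binary)
    return " ".join(parts)


print(sketch_srun_command(1, "gpus_for_tasks", task_gpp=1))
print(sketch_srun_command(8, "gpus_for_tasks", task_ppn=4, task_gpp=1))
print(sketch_srun_command(2, "gpus_for_tasks", task_ppn=1, task_gpp=4))
```

Under these assumptions the number of nodes is the process count divided by processes per node (rounded up), and each process receives an equal share of the node's cores.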
520+
521+ If you try to launch a task with too many GPUs per node, *e.g.*:
522+
523+ .. code-block:: python
524+
525+    self.services.launch_task(8, cwd, "gpu-per-task", task_gpp=1)
526+
527+ then it will raise a :class:`~ipsframework.ipsExceptions.GPUResourceRequestMismatchException`.
528+
429529.. automethod:: ipsframework.services.ServicesProxy.launch_task
430530    :noindex:
431531