Multiple tasks on a job

A task is a running instance of a program and corresponds to the usual definition of a “process”. The tasks of each job step are spawned by the srun command.

The number of tasks (processes) spawned is set by --ntasks. Each task is allocated the number of CPUs set by --cpus-per-task, and its memory is the value of the --mem-per-cpu option multiplied by the number of CPUs allocated to the task.

Tasks cannot be split across several compute nodes, so all CPUs requested with the --cpus-per-task option will be located on the same compute node. By contrast, CPUs allocated to distinct tasks can end up on distinct nodes.
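For illustration, here is a minimal sketch (the resource values are arbitrary and ./my_app is a placeholder for your own program): each of the 4 tasks gets 2 CPUs on a single node and 2 x 3GB = 6GB of memory, although the tasks themselves may end up on different nodes.

#!/bin/sh
#SBATCH --ntasks=4        # 4 tasks (processes)
#SBATCH --cpus-per-task=2 # 2 CPUs per task, always on the same node
#SBATCH --mem-per-cpu=3GB # each task gets 2 x 3GB = 6GB of memory
srun ./my_app             # spawns the 4 tasks of this job step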

Running multiple programs with --multi-prog option

The --multi-prog option of the srun command allows running multiple programs within a single job.

Information about this option can be obtained by running man srun and searching for “multi-prog” in the displayed manual. There are also example scripts on FinisTerrae III, for instance: /opt/cesga/job-scripts-examples-ft3/Simple_Multiprog_Job.sh

Job submission

To submit a job with sbatch, you must first create the script to be used. In the following example we request 8 cores on 1 node (a total of 8 tasks), but it is easily adaptable to the desired number of nodes and cores using the options and parameters of sbatch.

The script test.sh would be something like this:

#!/bin/sh
#SBATCH -n 8 # 8 cores
#SBATCH -t 10:00:00 # 10 hours
#SBATCH --mem=10GB
srun --multi-prog test.config

Configuration file

The test.config file contains the parameters required by the --multi-prog option. Each line of the configuration file contains 3 fields, separated by spaces:

- The task number (or range of task numbers)
- The program or script to run
- Arguments or directives

Available parameters:

- %t: the task number of the responsible task
- %o: the task offset (the task’s relative position in the task range)

The test.config would be something like this:

# srun multiple program configuration file
#
# srun -n8 -l --multi-prog test.config

0 /home/cesga/user/multiprog/script.sh %t
1 /home/cesga/user/multiprog/script.sh %t
2 /home/cesga/user/multiprog/script.sh %t
3 /home/cesga/user/multiprog/script.sh %t
4 /home/cesga/user/multiprog/script.sh %t
5 /home/cesga/user/multiprog/script.sh %t
6 /home/cesga/user/multiprog/script.sh %t
7 /home/cesga/user/multiprog/script.sh %t

Note

Note that when using your own executable or script, as in this case, you must give the full path to the file.

The number of tasks indicated in this configuration file must be exactly the same as the number of tasks requested through sbatch.

In this example we execute a script (script.sh) that requires one core, and the task number is passed as an argument, which can be used, much as with job arrays, to select any parameters needed at execution time.
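As a sketch, script.sh could use the received task number to select its own parameters, for instance by reading one line of a parameter file (params.txt and my_program are hypothetical names):

#!/bin/sh
# $1 is the task number passed as %t in the configuration file
PARAMS=$(sed -n "$(( $1 + 1 ))p" params.txt) # task 0 reads line 1, task 1 line 2, ...
./my_program $PARAMS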

It is possible to indicate that a script uses more than one core for its execution by adding the -c NCORES option, and different programs/scripts can be mixed within the configuration file, with a different script for each entry, as sketched below. Further examples can be found in the locations given at the beginning of this section.
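For instance, a mixed configuration file could look like the following sketch (pre.sh, worker.sh and post.sh are hypothetical scripts; launched with srun -n8 -c4 --multi-prog mixed.config, each task would get 4 cores, so the sbatch request must ask for 8 tasks with 4 cores each):

# mixed.config: a different program per entry
0 /home/cesga/user/multiprog/pre.sh
1-6 /home/cesga/user/multiprog/worker.sh %t
7 /home/cesga/user/multiprog/post.sh %o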

For the example shown above, since the same script is executed for all the tasks, we could simplify the configuration file to a single line:

0-7 /home/cesga/user/multiprog/script.sh %t

where script.sh simply prints the task number it receives:

#!/bin/sh
echo "TAREA=$1"

The output would be:

TAREA=7
TAREA=0
TAREA=1
TAREA=4
TAREA=5
TAREA=3
TAREA=2
TAREA=6

Note

The order in which the task outputs appear can differ between runs of the same job.

Running multistep programs and -r, --relative=<n>

Different programs can run at the same time on different resources within a job: multiple srun invocations can run concurrently within a given job as long as they do not exceed the resources reserved for that job.

#!/bin/bash
#SBATCH -n 8 # 8 tasks
#SBATCH --ntasks-per-node=4 # 2 nodes
#SBATCH -t 10:00 # 10 min
#SBATCH --mem=10GB
#STEP 0
srun -n4 sleep 60 &
#STEP 1
srun -n4 sleep 60 &
wait

In this case, the 4 tasks of step 0 are executed with a block distribution (3 tasks on the first node and one on the second), and the 4 tasks of step 1 wait for the tasks of step 0 to finish and are then executed with the same distribution over the 2 nodes assigned to the job.

#STEP 0
srun -N1 -n4 sleep 60 &
#STEP 1
srun -N1 -r1 -n4 sleep 60 &

However, with the modification above, the tasks of step 0 are executed on the first node and those of step 1 on the second node. Note the need to specify -N1 to define the number of nodes on which the tasks of step 0 run within the job resources, and -N1 -r1 for step 1 to define the number of nodes and their relative position within the nodes assigned to the job.

-r, --relative=<n>: Run a job step relative to node n of the current allocation. This option may be used to spread several job steps out among the nodes of the current job. If -r is used, the current job step will begin at node n of the allocated node list, where the first node is considered node 0. The -r option is not permitted with the -w or -x options and will result in a fatal error when not running within a prior allocation (i.e. when SLURM_JOB_ID is not set). The default for n is 0. If the value of --nodes exceeds the number of nodes identified with the --relative option, a warning message will be printed and the --relative option will take precedence.
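For example, in a hypothetical job allocated 4 nodes, the following step would start at the third node of the allocation (node 2, counting from 0) and use nodes 2 and 3:

srun -N2 -r2 -n8 ./my_app & # ./my_app is a placeholder for your own program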

Combined multiprog/multistep example: /opt/cesga/job-scripts-examples-ft3/Multiprog_Job.sh

Running multiple programs within a job using GNU Parallel and srun

When the number of tasks to be carried out is very large and the tasks do not all last the same amount of time, the previous approach is not well suited; the combined use of GNU Parallel and srun is more convenient. With this submission method, a single job is submitted with a chosen number of tasks (using the --ntasks=X option) and the parallel command runs that number of tasks simultaneously until all tasks have completed. Every time a task finishes, another one is launched, without having to wait for all of them to finish.

You can find more information in the GNU Parallel documentation.

Regardless of the resources requested in the job, the recommended srun and GNU Parallel options are:

MEMPERCORE=$(eval $(scontrol show partition $SLURM_JOB_PARTITION -o); echo $DefMemPerCPU)

#SRUN="srun -N1 -n1 --mem=$(( $MEMPERCORE*$OMP_NUM_THREADS )) -c $OMP_NUM_THREADS --cpu_bind=none"

SRUN="srun --exclusive -N1 -n1 --mem=$(( $MEMPERCORE*$OMP_NUM_THREADS )) -c $OMP_NUM_THREADS"

# --delay .2 prevents overloading the controlling node
# -j is the number of tasks parallel runs, so we set it to $SLURM_NTASKS
# --joblog makes parallel create a log of tasks that it has already run
# --resume makes parallel use the joblog to resume from where it has left off
# the combination of --joblog and --resume allows jobs to be resubmitted if
# necessary and continue from where they left off

parallel="parallel --delay .2 -j $SLURM_NTASKS --joblog logs/runtask.log --resume"

# this runs the parallel command we want
# in this case, we are running a script named task.sh
# parallel uses ::: to separate options. Here {0..99} is a shell expansion
# so parallel will run the command passing the numbers 0 through 99
# via argument {1}

$parallel “$SRUN ./task.sh {1} > logs/parallel_{1}.log” ::: {0..99}

These options use parallel to run multiple srun invocations simultaneously. They cover the execution of sequential tasks (using a single core each) as well as multicore or multithreaded tasks, by adjusting the resource request of the sbatch command appropriately:

#SBATCH -n 12 # (12 tasks, a single core per task)
#SBATCH -n 12 -c4 # (12 multicore tasks with 4 cores per task)

The --exclusive option is necessary when executing different job steps within the same node, since it ensures that each task gets dedicated resources. Another possibility is to use the --cpu_bind=none option so that the operating system is responsible for binding tasks to CPUs.
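Putting the pieces together, a complete submission script might look like the following sketch (task.sh is a placeholder for your own workload; here it is run 100 times with arguments 0 through 99, at most 12 at a time):

#!/bin/bash
#SBATCH -n 12       # 12 simultaneous tasks
#SBATCH -c 1        # 1 core per task
#SBATCH -t 10:00:00 # 10 hours

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# memory per core taken from the partition default
MEMPERCORE=$(eval $(scontrol show partition $SLURM_JOB_PARTITION -o); echo $DefMemPerCPU)
SRUN="srun --exclusive -N1 -n1 --mem=$(( $MEMPERCORE*$OMP_NUM_THREADS )) -c $OMP_NUM_THREADS"

mkdir -p logs
parallel="parallel --delay .2 -j $SLURM_NTASKS --joblog logs/runtask.log --resume"

# run task.sh once per argument 0..99, $SLURM_NTASKS tasks at a time
$parallel "$SRUN ./task.sh {1} > logs/parallel_{1}.log" ::: {0..99}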