.. _ft3_batch_examples:

Sample Job Scripts
==================

To execute any job on FinisTerrae III, it is mandatory to specify the maximum execution time and the memory needed, as explained in the `memory and time `_ section.

You can copy and customize the following scripts, specifying the resources needed for your jobs or simulations. The most common parameters you can modify are:

* ``--mem=`` or ``--mem-per-cpu=``: mandatory parameter.
* ``-t DD-HH:MM:SS`` or ``--time=DD-HH:MM:SS``: mandatory parameter, Days-Hours:Minutes:Seconds.
* ``-N``: number of nodes requested.
* ``-c``: number of cores requested.
* ``-C clk``: to request `clk nodes. `_
* ``-n``: total number of tasks.
* ``--ntasks-per-node=``: number of tasks specified per node.
* ``-J``: job name.
* ``-o``: direct the job standard output to output_file.
* ``-e``: direct the job standard error to error_file.
* ``--mail-user=`` and ``--mail-type=``: optional parameters for `email notifications. `_

You can find more example scripts in the FinisTerrae III directory ``/opt/cesga/job-scripts-examples-ft3``.

Using srun
----------

.. code-block::

   $ srun -n2 --time=00:00:10 --mem=1GB hostname

Prompt output (it may differ depending on the node you are connected to)::

   $ c211-15
   $ c211-15

This requests two tasks (``-n2``) and executes the command **hostname** with a maximum execution time of 10 seconds and 1 GB of RAM memory. This option is not recommended, since it blocks the prompt until the job has finished completely. We only recommend ``srun`` to check that the configuration of the parameters is right or to make sure that other commands or scripts are functional.

Using sbatch
------------

- Generate a script ``job.sh`` containing::

    #!/bin/bash
    #----------------------------------------------------
    # Example SLURM job script with SBATCH
    #----------------------------------------------------
    #SBATCH -J myjob            # Job name
    #SBATCH -o myjob_%j.o       # Name of stdout output file (%j expands to jobId)
    #SBATCH -e myjob_%j.e       # Name of stderr output file (%j expands to jobId)
    #SBATCH -c 8                # Cores per task requested
    #SBATCH -t 00:10:00         # Run time (hh:mm:ss) - 10 min
    #SBATCH --mem-per-cpu=3G    # Memory per core demanded (24 GB = 3 GB * 8 cores)

    module load cesga/2020
    srun hostname
    echo "done"                 # Write this message on the output file when finished

- Submit the job using ``sbatch job.sh``.

Using sbatch and GPUs
---------------------

The average NVIDIA A100 nodes have 2 GPUs per node; you can request the use of 1 or 2 GPUs with the option ``--gres=gpu:N``, where N is 1 or 2. There are also two new special nodes with more GPUs per node:

* 5x NVIDIA A100: to use this node, set ``--gres=gpu:N`` where N is a value between 3 and 5.
* 8x NVIDIA A100: to use this node, set ``--gres=gpu:N`` where N is a value between 6 and 8.

You must take into account that the number of cores requested changes with the number of GPUs requested, as follows:

.. warning::

   * CPUs requested for the 2x NVIDIA A100 nodes must be 32 per GPU requested.
   * CPUs requested for the 5x NVIDIA A100 node must be 12 per GPU requested.
   * CPUs requested for the 8x NVIDIA A100 node must be 8 per GPU requested.
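These fixed core-per-GPU ratios can be computed before submitting. The following is a minimal sketch of a hypothetical helper (``gpu_cores.sh`` is illustrative and is not one of the scripts shipped in ``/opt/cesga/job-scripts-examples-ft3``); the explicit combinations it reproduces are listed below::

    #!/bin/bash
    # gpu_cores.sh - print the matching --gres and -c lines for a GPU request (illustrative helper)
    # Usage: ./gpu_cores.sh <node_type: 2x|5x|8x> <number_of_gpus>
    node_type=$1
    gpus=$2
    case "$node_type" in
      2x) cores_per_gpu=32 ;;   # 2x NVIDIA A100 nodes: 32 cores per GPU
      5x) cores_per_gpu=12 ;;   # 5x NVIDIA A100 node:  12 cores per GPU
      8x) cores_per_gpu=8  ;;   # 8x NVIDIA A100 node:   8 cores per GPU
      *)  echo "unknown node type: $node_type" >&2; exit 1 ;;
    esac
    echo "#SBATCH --gres=gpu:a100:${gpus}"
    echo "#SBATCH -c $((gpus * cores_per_gpu))"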
For example, to request 1 GPU of the 2 available on an average A100 node::

    #SBATCH --gres=gpu:a100:1   # Request 1 GPU of 2 available on an average A100 node
    #SBATCH -c 32               # Cores per task requested

To request 2 GPUs of the 2 available on an average A100 node::

    #SBATCH --gres=gpu:a100:2   # Request 2 GPUs of 2 available on an average A100 node
    #SBATCH -c 64               # Cores per task requested

To request 3 GPUs of the 5x A100 node::

    #SBATCH --gres=gpu:a100:3   # Request 3 GPUs of 5 available on the 5x A100 node
    #SBATCH -c 36               # Cores per task requested

To request 4 GPUs of the 5x A100 node::

    #SBATCH --gres=gpu:a100:4   # Request 4 GPUs of 5 available on the 5x A100 node
    #SBATCH -c 48               # Cores per task requested

To request 5 GPUs of the 5x A100 node::

    #SBATCH --gres=gpu:a100:5   # Request 5 GPUs of 5 available on the 5x A100 node
    #SBATCH -c 60               # Cores per task requested

To request 6 GPUs of the 8x A100 node::

    #SBATCH --gres=gpu:a100:6   # Request 6 GPUs of 8 available on the 8x A100 node
    #SBATCH -c 48               # Cores per task requested

To request 7 GPUs of the 8x A100 node::

    #SBATCH --gres=gpu:a100:7   # Request 7 GPUs of 8 available on the 8x A100 node
    #SBATCH -c 56               # Cores per task requested

To request 8 GPUs of the 8x A100 node::

    #SBATCH --gres=gpu:a100:8   # Request 8 GPUs of 8 available on the 8x A100 node
    #SBATCH -c 64               # Cores per task requested

- Generate a script ``job_GPU.sh`` containing::

    #!/bin/bash
    #------------------------------------------------------
    # Example SLURM job script with SBATCH requesting GPUs
    #------------------------------------------------------
    #SBATCH -J myjob            # Job name
    #SBATCH -o myjob_%j.o       # Name of stdout output file (%j expands to jobId)
    #SBATCH -e myjob_%j.e       # Name of stderr output file (%j expands to jobId)
    #SBATCH --gres=gpu:a100:1   # Request 1 GPU of 2 available on an average A100 node
    #SBATCH -c 32               # Cores per task requested
    #SBATCH -t 00:10:00         # Run time (hh:mm:ss) - 10 min
    #SBATCH --mem-per-cpu=3G    # Memory per core demanded (96 GB = 3 GB * 32 cores)

    module load cesga/2020
    srun hostname
    echo "done"                 # Write this message on the output file when finished

- Submit the job using ``sbatch job_GPU.sh``.

**You must change the number of cores as indicated above according to the number of GPUs requested.**
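To confirm that the allocation matches the request, a variant of ``job_GPU.sh`` can print the GPUs visible inside the job. This is a minimal sketch, assuming ``nvidia-smi`` is available on the GPU nodes and that SLURM exports ``CUDA_VISIBLE_DEVICES`` for the allocation; it is not one of the official example scripts::

    #!/bin/bash
    #SBATCH -J gpucheck         # Job name
    #SBATCH -o gpucheck_%j.o    # Name of stdout output file (%j expands to jobId)
    #SBATCH -e gpucheck_%j.e    # Name of stderr output file (%j expands to jobId)
    #SBATCH --gres=gpu:a100:1   # Request 1 GPU of 2 available on an average A100 node
    #SBATCH -c 32               # Cores per task requested (32 per GPU on these nodes)
    #SBATCH -t 00:05:00         # Run time (hh:mm:ss) - 5 min
    #SBATCH --mem-per-cpu=3G    # Memory per core demanded

    module load cesga/2020
    echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"   # GPUs assigned to the job by SLURM
    nvidia-smi                                          # should list exactly the allocated GPUs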
OpenMP job submission
---------------------

- Compilation (``compileOpenMP.sh``)::

    #!/bin/bash
    module load cesga/2020 intel
    icc -qopenmp -o omphello ./omphello.c
    #module load cesga/2020
    #gcc -fopenmp -o omphello ./omphello.c

- Generate a script ``openmp_job.sh`` containing::

    #!/bin/bash
    #----------------------------------------------------
    # Example OPENMP job script
    #----------------------------------------------------
    #SBATCH -J myjob            # Job name
    #SBATCH -o myjob_%j.o       # Name of stdout output file (%j expands to jobId)
    #SBATCH -e myjob_%j.e       # Name of stderr output file (%j expands to jobId)
    #SBATCH -c 8                # Cores per task requested
    #SBATCH -N 1                # Total # of nodes (must be 1 for OpenMP)
    #SBATCH -n 1                # Total # of mpi tasks (should be 1 for OpenMP)
    #SBATCH -t 00:10:00         # Run time (hh:mm:ss) - 10 min
    #SBATCH --mem-per-cpu=3G    # Memory per core demanded (24 GB = 3 GB * 8 cores)

    ./omphello
    echo "done"                 # Write this message on the output file when finished

- Submit the job using ``sbatch openmp_job.sh``.

MPI job submission
------------------

- Compilation (``compileMPI.sh``)::

    #!/bin/bash
    module load intel impi
    mpiifort -o pi ./pi3f90.f90
    #module load gcc openmpi/4.1.1_ft3
    #mpifort -o pi ./pi3f90.f90

- Generate a script ``mpi_job.sh`` containing::

    #!/bin/bash
    #----------------------------------------------------
    # Example MPI job script
    #----------------------------------------------------
    #SBATCH -J myjob            # Job name
    #SBATCH -o myjob_%j.o       # Name of stdout output file (%j expands to jobId)
    #SBATCH -e myjob_%j.e       # Name of stderr output file (%j expands to jobId)
    #SBATCH -N 2                # Total # of nodes
    #SBATCH -c 8                # Cores per task requested
    #SBATCH -n 16               # Total # of mpi tasks
    #SBATCH -t 00:10:00         # Run time (hh:mm:ss) - 10 min
    #SBATCH --mem-per-cpu=1G    # Memory per core demanded

    module load intel impi
    srun ./pi
    echo "done"                 # Write this message on the output file when finished

- Submit the job using ``sbatch mpi_job.sh``.

Two nodes are requested (*-N 2*), running 16 processes in total (*-n 16*), i.e. 8 processes per node, with 8 cores per process (*-c 8*, in case the program can use this kind of hybrid parallelization): 128 cores in total (2 full nodes).

For job scripts demanding exclusive nodes, you must add the flag ``#SBATCH --exclusive``.

Hybrid MPI/OpenMP programs
--------------------------

- Compilation (``compileMPIOpenMP.sh``)::

    #!/bin/bash
    #FORTRAN
    #INTEL
    module load cesga/2020 intel impi
    icc -c help_fortran_find_core_id.c
    mpiifort -fopenmp -o hybrid ./hybrid.f90 help_fortran_find_core_id.o
    #GNU
    #module load cesga/2020 gcc openmpi/4.0.5_ft3
    #gcc -c help_fortran_find_core_id.c
    #mpif90 -fopenmp -ffree-line-length-256 -o hybrid ./hybrid.f90 help_fortran_find_core_id.o
    #C
    #module load cesga/2020 intel impi
    #mpiicc -fopenmp -o hybrid ./hybrid.c
    #GNU
    #module load cesga/2020 gcc openmpi
    #mpicc -fopenmp -o hybrid ./hybrid.c

- Generate the Fortran source file ``hybrid.f90`` containing::

    !* ************************************************************************** *!
    !*                                                                            *!
    !* Hybrid MPI+OpenMP "Hello world!" program (Fortran source code).            *!
    !*                                                                            *!
    !* - Reports core_id and node_name for all MPI processes and OpenMP threads.  *!
    !* - It does not use conditional compilation (for brevity and readability).   *!
    !* - Needed: help_fortran_find_core_id.c and *.o (icc -c *.c)                 *!
    !*                                                                            *!
    !* - Course material: Introduction to Hybrid Programming in HPC               *!
    !*                                                                            *!
    !* It is made freely available with the understanding that                    *!
    !* every copy must include this header and that                              *!
    !* the authors as well as VSC and TU Wien                                    *!
    !* take no responsibility for the use of this program.                       *!
    !*                                                                           *!
    !* (c) 01/2019 Claudia Blaas-Schenner (VSC Team, TU Wien)                    *!
    !*             claudia.blaas-schenner@tuwien.ac.at                           *!
    !*                                                                           *!
    !* vsc3: module load intel/18 intel-mpi/2018                                 *!
    !* vsc3: mpiifort -qopenmp -o he-hy he-hy.f90 help_fortran_find_core_id.o    *!
    !* vsc3: export MPI_PROCESSES=4    [1-16 on one default node (16 cores)]     *!
    !* vsc3: export OMP_NUM_THREADS=4  [1-16 on one default node (16 cores)]     *!
    !* vsc3: export KMP_AFFINITY=granularity=thread,compact,1,0                  *!
    !* vsc3: export I_MPI_PIN_DOMAIN=`expr 2 \* $OMP_NUM_THREADS`  [h.t.]        *!
    !* vsc3: mpirun -n $MPI_PROCESSES ./he-hy | sort -n | cut -c 1-78            *!
    !*                                                                           *!
    !* ************************************************************************* *!

    program main

    use mpi_f08                                         ! MPI header/module
    use omp_lib                                         ! OpenMP header/module

    implicit none

    integer ierror                                      ! OPTIONAL with mpi_f08
    integer rank, size                                  ! MPI
    integer thread_id, num_threads                      ! OpenMP
    integer provided                                    ! MPI+OpenMP
    integer core_id                                     ! ... core_id
    integer, external :: find_core_id                   ! ... core_id (external)
    integer namelen                                     ! ... MPI processor_name
    character(len=MPI_MAX_PROCESSOR_NAME) :: name       ! ... MPI processor_name

    rank = 0                                            ! MPI - initialized
    size = 1                                            ! MPI - initialized
    thread_id = 0                                       ! OpenMP - initialized
    num_threads = 1                                     ! OpenMP - initialized
    provided = 0                                        ! MPI+OpenMP - initialized

    call MPI_Init_thread(MPI_THREAD_FUNNELED, provided, ierror)   ! MPI+OpenMP
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierror)              ! MPI rank
    call MPI_Comm_size(MPI_COMM_WORLD, size, ierror)              ! MPI size
    call MPI_Get_processor_name(name, namelen, ierror)            ! MPI processor_name

    !$omp parallel private(thread_id,num_threads,core_id)         ! OpenMP parallel

    thread_id = omp_get_thread_num()                    ! OpenMP thread_id
    num_threads = omp_get_num_threads()                 ! OpenMP num_threads
    core_id = find_core_id()                            ! ... core_id

    if (rank .eq. 0 .and. thread_id .eq. 0) then
       write(*,"('a: he-hy = Hybrid MPI+OpenMP program that reports core_id and node_name (c) cb')")
       write(*,"('b: all levels of MPI_THREAD_*: SINGLE=',i1,', FUNNELED=',i1,', SERIALIZED=',i1,', MULTIPLE=',i1)") MPI_THREAD_SINGLE, MPI_THREAD_FUNNELED, MPI_THREAD_SERIALIZED, MPI_THREAD_MULTIPLE
       write(*,"('c: level of thread support required = ',i1,' and provided = ',i1)") MPI_THREAD_FUNNELED, provided
       write(*,"('d: Hello world! -Running with ',i4,' MPI processes each with ',i4,' OpenMP threads')") size, num_threads
    endif

    write(*,"('MPI process ',i4,' / ',i4,' OpenMP thread ',i4,' / ',i4,' ON core ',i4,' of node ',a15)") rank, size, thread_id, num_threads, core_id, name

    !$omp end parallel                                  ! OpenMP end parallel

    call sleep(60)

    call MPI_Finalize(ierror)                           ! MPI finalization

    end program

- Generate the C source file ``hybrid.c`` containing::

    /* ************************************************************************** */
    /*                                                                            */
    /* Hybrid MPI+OpenMP "Hello world!" program (C source code).                  */
    /*                                                                            */
    /* - Reports core_id and node_name for all MPI processes and OpenMP threads.  */
    /* - It does not use conditional compilation (for brevity and readability).   */
    /*                                                                            */
    /* - Course material: Introduction to Hybrid Programming in HPC               */
    /*                                                                            */
    /* It is made freely available with the understanding that                    */
    /* every copy must include this header and that                               */
    /* the authors as well as VSC and TU Wien                                     */
    /* take no responsibility for the use of this program.                        */
    /*                                                                            */
    /* (c) 01/2019 Claudia Blaas-Schenner (VSC Team, TU Wien)                     */
    /*             claudia.blaas-schenner@tuwien.ac.at                            */
    /*                                                                            */
    /* vsc3: module load intel/18 intel-mpi/2018                                  */
    /* vsc3: mpiicc -qopenmp -o he-hy he-hy.c                                     */
    /* vsc3: export MPI_PROCESSES=4    [1-16 on one default node (16 cores)]      */
    /* vsc3: export OMP_NUM_THREADS=4  [1-16 on one default node (16 cores)]      */
    /* vsc3: export KMP_AFFINITY=granularity=thread,compact,1,0                   */
    /* vsc3: export I_MPI_PIN_DOMAIN=`expr 2 \* $OMP_NUM_THREADS`  [h.t.]         */
    /* vsc3: mpirun -n $MPI_PROCESSES ./he-hy | sort -n | cut -c 1-78             */
    /*                                                                            */
    /* ************************************************************************** */

    #define _GNU_SOURCE                       /* needed for sched_getcpu() */
    #include <sched.h>                        /* ... sched_getcpu() */
    #include <mpi.h>                          /* MPI header */
    #include <omp.h>                          /* OpenMP header */
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
      int rank = 0, size = 1;                 /* MPI - initialized */
      int thread_id = 0, num_threads = 1;     /* OpenMP - initialized */
      int provided = 0;                       /* MPI+OpenMP */
      int core_id;                            /* ... core_id */
      int namelen;                            /* ... MPI processor_name */
      char name[MPI_MAX_PROCESSOR_NAME];      /* ... MPI processor_name */

      MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);  /* MPI+OpenMP */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);                           /* MPI rank */
      MPI_Comm_size(MPI_COMM_WORLD, &size);                           /* MPI size */
      MPI_Get_processor_name(name, &namelen);                         /* ... MPI processor_name */

    #pragma omp parallel private(thread_id,num_threads,core_id)       /* OpenMP parallel */
      {
        thread_id = omp_get_thread_num();     /* OpenMP thread_id */
        num_threads = omp_get_num_threads();  /* OpenMP num_threads */
        core_id = sched_getcpu();             /* ... core_id */

        if (rank == 0 && thread_id == 0)
        {
          printf ("a: he-hy = Hybrid MPI+OpenMP program that reports core_id and node_name (c) cb\n");
          printf ("b: all levels of MPI_THREAD_*: SINGLE=%d, FUNNELED=%d, SERIALIZED=%d, MULTIPLE=%d\n", MPI_THREAD_SINGLE, MPI_THREAD_FUNNELED, MPI_THREAD_SERIALIZED, MPI_THREAD_MULTIPLE);
          printf ("c: level of thread support required = %d and provided = %d\n", MPI_THREAD_FUNNELED, provided);
          printf ("d: Hello world! -Running with %4i MPI processes each with %4i OpenMP threads\n", size, num_threads);
        }

        printf ("MPI process %4i / %4i OpenMP thread %4i / %4i ON core %4i of node %s\n", rank, size, thread_id, num_threads, core_id, name);
      }                                       /* OpenMP parallel end */

      MPI_Finalize();                         /* MPI finalization */
    }

- Generate a script ``MPIOpenMP_Job_on_exclusive_nodes.sh`` to submit the job on exclusive nodes, containing::

    #!/bin/bash
    #SBATCH -J MPIOpenMP_Job_on_exclusive_nodes -o %x-%J.out
    #SBATCH -t 00:20:00 -n 4 -c 8 --ntasks-per-node=2 --mem=256G
    #SBATCH --exclusive -m cyclic:cyclic:fcyclic

    module load cesga/2020 intel impi
    #module load cesga/2020 gcc openmpi

    #cgroups monitoring
    #/opt/cesga/job-scripts-examples-ft3/monitoring/cgroups_info.sh &

    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
    srun --cpu-bind=verbose,cores -c $SLURM_CPUS_PER_TASK ./hybrid

- Generate a script ``MPIOpenMP_Job_on_shared_nodes.sh`` to submit the job on shared nodes, containing::

    #!/bin/bash
    #SBATCH -J MPIOpenMP_Job_on_shared_nodes -o %x-%J.out
    #SBATCH -t 00:20:00
    #SBATCH -n 8 --ntasks-per-node=4 -c 8 --mem-per-cpu=3G

    module load cesga/2020 intel impi

    #cgroups monitoring
    /opt/cesga/job-scripts-examples/monitoring/cgroups_info.sh &

    echo "DATE: $(date '+%s') "
    srun --cpu-bind=verbose ./hybrid

Submit either script using ``sbatch MPIOpenMP_Job_on_exclusive_nodes.sh`` or ``sbatch MPIOpenMP_Job_on_shared_nodes.sh``.

.. note::

   You can access these scripts in the FinisTerrae III directory ``/opt/cesga/job-scripts-examples-ft3``.
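Once a job has been submitted with any of the scripts above, its state and resource usage can be checked with standard SLURM commands. A minimal sketch follows (the job ID shown is illustrative)::

   $ sbatch MPIOpenMP_Job_on_shared_nodes.sh
   Submitted batch job 1234567
   $ squeue -j 1234567                                      # queue state while the job is pending or running
   $ sacct -j 1234567 --format=JobID,Elapsed,MaxRSS,State   # accounting summary once the job has finished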