Differences

This shows you the differences between two versions of the page.

--- aurora_cluster:running_on_aurora [2016/12/06 10:05]
florido [Custom software]
+++ aurora_cluster:running_on_aurora [2022/04/19 14:42] (current)
florido
@@ Line 6: / Line 6: @@
 There are only a few important things one needs to know:
-  * HEP nodes are running in a special **partition** (a.k.a. queue) called ''hep''. Whenever in the documentation you're asked to specify a queue, use ''hep''
+  * A set of nodes is selected by choosing a **partition**, a **project** and a **reservation**, one for each division. The HEP nodes do not require a special reservation flag to be accessed. The partitions, project names and reservation flags are listed in the table below:
-  * A set of nodes is selected by choosing a **project** and a **reservation**, one for each division. The HEP nodes do not require a special reservation flag to be accessed. The project names and reservation flags are listed in the table below:
-^ Your division ^ SLURM Partition ^ Project String ^ Reservation String ^ call srun with ^
+^ Your division ^ SLURM Partition ^ Project String ^ Reservation String ^ call srun/sbatch with ^ Nodes ^
-| Nuclear Physics | ''hep'' | ''HEP2016-1-3'' | not needed | <code:bash>srun -p hep -A HEP2016-1-3 <scriptname></code> |
+| Nuclear Physics | ''hep'' | ''HEP2016-1-3'' | not needed | <code:bash>srun -p hep -A HEP2016-1-3 <scriptname></code> | ''au[193-216]'' |
-| Particle Physics | ''hep'' | ''HEP2016-1-4'' | not needed | <code:bash>srun -p hep -A HEP2016-1-4 <scriptname></code> |
+| Particle Physics | ''hep'' | ''HEP2016-1-4'' | not needed | <code:bash>srun -p hep -A HEP2016-1-4 <scriptname></code> | ::: |
-| Theoretical Physics | ''hep'' | ''HEP2016-1-5'' | not needed | <code:bash>srun -p hep -A HEP2016-1-5 <scriptname> </code> |
+| Theoretical Physics | ''hep'' | ''HEP2016-1-5'' | not needed | <code:bash>srun -p hep -A HEP2016-1-5 <scriptname> </code> | ::: |
-| Mathematical Physics | ''lu'' | ''lu2016-2-10'' | ''lu2016-2-10'' | <code:bash>srun -p lu -A lu2016-2-10 --reservation=lu2016-2-10 <scriptname> </code> |
+| Particle Physics - LDMX | ''lu'' | ''lu2021-2-100'' | not needed | <code:bash>srun -p lu -A lu2021-2-100 <scriptname></code> | any available on the LU partition |
+| Mathematical Physics | ''lu'' | ''lu2021-2-125'' | ''lu2021-2-125'' | <code:bash>srun -p lu -A lu2021-2-125 --reservation=lu2021-2-125 <scriptname> </code> | ''mn[01-10],mn[15-20]'' |
+| Mathematical Physics, **select only skylake machines** | ''lu'' | ''lu2021-2-125'' | ''lu2021-2-125'' | <code:bash>srun -C skylake -p lu -A lu2021-2-125 --reservation=lu2021-2-125 <scriptname> </code> | ''mn[15-20]'' |
   * Home folders are backed up by Lunarc.
@@ Line 19: / Line 20: @@
   * HEP storage is a bit different from the others and there are a few additional rules to use it. Please read [[aurora_cluster:storage]]
-A typical direct submission to our nodes looks like:
+  * Basic usage of the batch system is described here:
+    * Using the SLURM batch system: http://lunarc-documentation.readthedocs.io/en/latest/batch_system/
+    * Batch system rules. Note: these rules might be slightly different for us since we have our own partition. http://lunarc-documentation.readthedocs.io/en/latest/batch_system_rules/
+===== Batch Scripts Examples =====
+==== hep partition (Nuclear, Particle and Theoretical Physics) ====
+A typical direct submission to ''hep'' nodes looks like:
 <code bash>srun -p hep -A HEP2016-1-4 myscript.sh</code>
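The partition, project and reservation values from the table can also be assembled by a small helper instead of being typed by hand. The following is only a sketch: the function name ''srun_cmd_for'' and the short division keywords (''nuclear'', ''particle'', ...) are made up for illustration, and the command is printed rather than executed so it can be checked first.

```shell
#!/bin/bash
# Sketch: build the srun command line for a division, using the values
# listed in the table. srun_cmd_for and the division keywords are
# hypothetical names, not part of SLURM or of the Aurora setup.
srun_cmd_for() {
    local division="$1" script="$2"
    case "$division" in
        nuclear)      echo "srun -p hep -A HEP2016-1-3 $script" ;;
        particle)     echo "srun -p hep -A HEP2016-1-4 $script" ;;
        theoretical)  echo "srun -p hep -A HEP2016-1-5 $script" ;;
        ldmx)         echo "srun -p lu -A lu2021-2-100 $script" ;;
        mathematical) echo "srun -p lu -A lu2021-2-125 --reservation=lu2021-2-125 $script" ;;
        *)            echo "unknown division: $division" >&2; return 1 ;;
    esac
}

# Print the command instead of running it.
srun_cmd_for particle myscript.sh
```

Once the printed line looks right, it can be pasted into the terminal on the cluster front end.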
@@ Line 34: / Line 43: @@
 </code>
-to be run with the command:
+The script is submitted to the SLURM batch queue with the command:
 <code bash>sbatch slurmexample.sh</code>
-Basic usage of the batch system is described here:
+The results will be found in the folder where the above command is run, in a file named after the SLURM job ID.
-  * Using the SLURM batch system: http://lunarc-documentation.readthedocs.io/en/latest/batch_system/
-  * Batch system rules. Note: these rules might be slightly different for us since we have our own partition. http://lunarc-documentation.readthedocs.io/en/latest/batch_system_rules/
+==== lu partition (Mathematical Physics) ====
+A typical direct submission to ''lu'' nodes looks like:
+<code bash>srun -p lu -A lu2016-2-10 --reservation=lu2016-2-10 myscript.sh</code>
+Here is an example of a typical slurm submission script ''slurmexample.sh'' written in bash, that prints the hostname of the node where the job is executed and the PID of the bash process running the script. It will have this prologue:
+<code bash>
+#!/bin/bash
+#
+#SBATCH -A lu2016-2-10
+#SBATCH -p lu
+#SBATCH --reservation=lu2016-2-10
+#
+hostname
+srun echo $BASHPID;
+</code>
+The script is submitted to the SLURM batch queue with the command:
+<code bash>sbatch slurmexample.sh</code>
+The results will be found in the folder where the above command is run, in a file named after the SLURM job ID.
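Since the ''hep'' and ''lu'' scripts differ only in their ''#SBATCH'' lines, the prologue can be generated from the project, partition and optional reservation. This is a minimal sketch under that assumption. ''make_batch_script'' is a hypothetical helper name, and the script is only printed, not submitted.

```shell
#!/bin/bash
# Sketch: print a SLURM batch script for a given project and partition.
# make_batch_script is a hypothetical helper, not part of SLURM.
make_batch_script() {
    local project="$1" partition="$2" reservation="${3:-}"
    echo '#!/bin/bash'
    echo '#'
    echo "#SBATCH -A $project"
    echo "#SBATCH -p $partition"
    # The reservation line is only emitted when a reservation is given.
    if [ -n "$reservation" ]; then
        echo "#SBATCH --reservation=$reservation"
    fi
    echo '#'
    echo 'hostname'
    echo 'srun echo $BASHPID'
}

# Same values as the lu example; the hep partition needs no third argument.
make_batch_script lu2016-2-10 lu lu2016-2-10
```

Redirecting the output to ''slurmexample.sh'' gives a file that can be passed to ''sbatch'' as described above.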
+Since 2018/09/27 there are new nodes ''mn[15-20]'' based on the //skylake// microarchitecture. One can select just these CPUs by using the ''-C'' flag.
+<code bash>sbatch -C skylake slurmexample.sh</code>
+For best performance one should recompile the code for these machines, meaning one needs to tell the compiler that skylake optimization is required. How to do this depends on the compiler. See [[https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(server)#Compiler_support]]
+For a discussion on the benefits in matrix calculus see: https://cfwebprod.sandia.gov/cfdocs/CompResearch/docs/bench2018.pdf
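If both a generic and a skylake-optimized build of a program are kept, a job can pick the right one from the name of the node it landed on. The sketch below assumes only that the skylake nodes are the ''mn[15-20]'' range mentioned above. ''is_skylake_node'' is a hypothetical helper name, and the two build names it prints are placeholders.

```shell
#!/bin/bash
# Sketch: decide whether a host name belongs to the skylake range mn[15-20].
# is_skylake_node is a hypothetical helper, not a SLURM command.
is_skylake_node() {
    local host="${1%%.*}"   # strip any domain suffix
    case "$host" in
        mn1[5-9]|mn20) return 0 ;;
        *)             return 1 ;;
    esac
}

# uname -n prints the name of the node the script is running on.
if is_skylake_node "$(uname -n)"; then
    echo "use the skylake build"
else
    echo "use the generic build"
fi
```

When the job is submitted with ''-C skylake'' this check is redundant, since SLURM already restricts the job to those nodes.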
 ===== Interactive access to nodes for code testing =====
@@ Line 50: / Line 86: @@
 <code bash>
-interactive -t 60 -p hep -A HEP2016-1-1
+interactive -t 00:60:00 -p hep -A HEP2016-1-4
+</code>
+<code bash>
+interactive -t 00:60:00 -p lu -A lu2016-2-10 --reservation=lu2016-2-10
 </code>
-where ''-t 60'' is the time in minutes you want the interactive session to last. You can put as much as you want in the timer. Mind that whatever you're running will be killed after the specified time.
+where ''-t 00:60:00'' is the time in hours:minutes:seconds you want the interactive session to last. You can put as much as you want in the timer. Mind that whatever you're running will be killed after the specified time.
 //slurm// will select a free node for you and open a bash terminal. From that moment on you can pretty much do the same as you were doing on Iridium testing nodes.
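For those used to giving the session length in minutes, the conversion to the hours:minutes:seconds form can be scripted. A minimal sketch follows. ''minutes_to_hms'' is a hypothetical helper name, and the ''interactive'' command is only printed, not run.

```shell
#!/bin/bash
# Sketch: convert a number of minutes into the HH:MM:SS form used with -t.
# minutes_to_hms is a hypothetical helper name.
minutes_to_hms() {
    local minutes="$1"
    printf '%02d:%02d:00\n' "$((minutes / 60))" "$((minutes % 60))"
}

# Print the command for a 60 minute session on the hep partition.
echo "interactive -t $(minutes_to_hms 60) -p hep -A HEP2016-1-4"
```

For 60 minutes the helper yields ''01:00:00'', which is one hour written in normalized form.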
@@ Line 99: / Line 138: @@
 ==== Singularity ====
-To be documented
+The old singularity version (1.0) is going to be removed in week 8 of 2017
+because of a security issue. Please update your scripts to enable singularity using these commands:
+<code:bash>
+module load GCC/4.9.3-2.25
+module load Singularity/2.2.1
+</code>
+It is recommended to **always use the version number** when loading the
+module, to avoid problems caused by version changes. If you omit it
+and the default module changes, you will run an unintended version.
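A script can guard against unversioned module names before loading them. The following is a sketch only. ''has_version'' is a hypothetical helper that merely checks for a ''name/version'' shape, and the ''module load'' lines are printed rather than executed, so it runs on any machine.

```shell
#!/bin/bash
# Sketch: accept only module names of the form name/version.
# has_version is a hypothetical helper, not part of the module system.
has_version() {
    case "$1" in
        */?*) return 0 ;;   # something follows the slash
        *)    return 1 ;;   # no slash, or nothing after it
    esac
}

for m in GCC/4.9.3-2.25 Singularity/2.2.1; do
    if has_version "$m"; then
        echo "module load $m"
    else
        echo "module $m has no version number" >&2
    fi
done
```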
 ===== Custom software =====
