====== Running on the Aurora Cluster ======

In general, running on Aurora follows the rules stated on the Lunarc manual pages. All jobs must be run under the //slurm// batch system. There are only a few important things one needs to know:

  * A set of nodes is selected by choosing a **partition**, a **project** and, for some divisions, a **reservation**. The HEP nodes do not require a special reservation flag to be accessed. The partitions, project names and reservation flags are listed in the table below:

^ Your division ^ SLURM Partition ^ Project String ^ Reservation String ^ Call srun/sbatch with ^ Nodes ^
| Nuclear Physics | ''hep'' | ''HEP2016-1-3'' | not needed | ''srun -p hep -A HEP2016-1-3'' | ''au[193-216]'' |
| Particle Physics | ''hep'' | ''HEP2016-1-4'' | not needed | ''srun -p hep -A HEP2016-1-4'' | ::: |
| Theoretical Physics | ''hep'' | ''HEP2016-1-5'' | not needed | ''srun -p hep -A HEP2016-1-5'' | ::: |
| Particle Physics - LDMX | ''lu'' | ''lu2021-2-100'' | not needed | ''srun -p lu -A lu2021-2-100'' | any available on the ''lu'' partition |
| Mathematical Physics | ''lu'' | ''lu2021-2-125'' | ''lu2021-2-125'' | ''srun -p lu -A lu2021-2-125 --reservation=lu2021-2-125'' | ''mn[01-10],mn[15-20]'' |
| Mathematical Physics, **select only skylake machines** | ''lu'' | ''lu2021-2-125'' | ''lu2021-2-125'' | ''srun -C skylake -p lu -A lu2021-2-125 --reservation=lu2021-2-125'' | ''mn[15-20]'' |

  * Home folders are backed up by Lunarc.
  * HEP storage is a bit different from the others and there are a few additional rules for using it. Please read [[aurora_cluster:storage]].
  * Basic usage of the batch system is described here:
    * Using the SLURM batch system: http://lunarc-documentation.readthedocs.io/en/latest/batch_system/
    * Batch system rules. Note: these rules might be slightly different for us since we have our own partition. http://lunarc-documentation.readthedocs.io/en/latest/batch_system_rules/

===== Batch Scripts Examples =====

==== hep partition (Nuclear, Particle and Theoretical Physics) ====

A typical direct submission to ''hep'' nodes looks like:

  srun -p hep -A HEP2016-1-4 myscript.sh

Here is an example of a typical slurm submission script ''slurmexample.sh'' written in bash, which prints the hostname of the node where the job is executed and the PID of the bash process running the script. The script looks like this:

  #!/bin/bash
  #
  #SBATCH -A hep2016-1-4
  #SBATCH -p hep
  #
  hostname
  srun echo $BASHPID;

The script is submitted to the SLURM batch queue with the command:

  sbatch slurmexample.sh

The results will be found in the folder where the above command is run, in a file named after the slurm job ID.

==== lu partition (Mathematical Physics) ====

A typical direct submission to ''lu'' nodes looks like:

  srun -p lu -A lu2016-2-10 --reservation=lu2016-2-10 myscript.sh

Here is an example of a typical slurm submission script ''slurmexample.sh'' written in bash, which prints the hostname of the node where the job is executed and the PID of the bash process running the script. The script looks like this:

  #!/bin/bash
  #
  #SBATCH -A lu2016-2-10
  #SBATCH -p lu
  #SBATCH --reservation=lu2016-2-10
  #
  hostname
  srun echo $BASHPID;

The script is submitted to the SLURM batch queue with the command:

  sbatch slurmexample.sh

The results will be found in the folder where the above command is run, in a file named after the slurm job ID.

Since 2018/09/27 there are new nodes ''mn[15-20]'' using the //skylake// chipset/microcode. One can select just these CPUs by using the ''-C'' flag:

  sbatch -C skylake slurmexample.sh

For best performance one should recompile the code for these machines, meaning one needs to tell the compiler that skylake optimization is required. How to do this varies depending on the compiler. See [[https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(server)#Compiler_support]]
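As an illustration only (these flags are not taken from the Lunarc documentation): with a sufficiently recent GCC the Skylake server target can be requested with ''-march=skylake-avx512''; other compilers use different option names, so check the link above for your compiler.

  # Sketch, assuming GCC >= 6 running on a skylake node; adjust for your compiler.
  gcc -O3 -march=skylake-avx512 -o myprogram myprogram.c

  # The Intel compiler spells the same request differently, e.g.:
  # icc -O3 -xCORE-AVX512 -o myprogram myprogram.c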
For a discussion of the benefits in matrix calculus see: https://cfwebprod.sandia.gov/cfdocs/CompResearch/docs/bench2018.pdf

===== Interactive access to nodes for code testing =====

As said before, **it is not possible** to run your test code on the Aurora frontend. But since everybody likes to test their code before submitting it to the batch system, //slurm// provides a nice way of using a node allocation as an interactive session, just like we were doing on ''pptest-iridium'' and ''nptest-iridium''. The interactive session is started using the ''interactive'' command and a few options:

  interactive -t 00:60:00 -p hep -A HEP2016-1-4
  interactive -t 00:60:00 -p lu -A lu2016-2-10 --reservation=lu2016-2-10

where ''-t 00:60:00'' is the time in hours:minutes:seconds you want the interactive session to last. You can request as much time as you want, but mind that whatever you are running will be killed after the specified time. //slurm// will select a free node for you and open a bash terminal. From that moment on you can do pretty much the same as you were doing on the Iridium testing nodes. The interactive session is terminated when you issue the ''exit'' command.

===== Loading libraries and tools on Aurora =====

Aurora allows the user to load specific versions of compilers and tools using the excellent ''module'' system. This system configures binary tools and library paths for you in a very flexible way. Read more about it in the Lunarc documentation here: http://lunarc-documentation.readthedocs.io/en/latest/aurora_modules/
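As a quick sketch of the typical workflow (the GCC version below is simply the one mentioned later on this page; use ''module spider'' to see what is actually installed):

  module spider GCC            # search for available GCC versions
  module avail                 # list modules that can be loaded right now
  module load GCC/4.9.3-2.25   # load a specific version (always give the version)
  module list                  # show what is currently loaded
  module purge                 # unload everything and start clean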
If you need a module that is not installed, please check this list: https://github.com/hpcugent/easybuild-easyconfigs/tree/master/easybuild/easyconfigs

If the software you need **is in the list but NOT on Aurora**, report to Florido, and he will coordinate with Lunarc to provide it. If the module **does not exist in the system nor in the above list**, you will have to **build and configure it yourself**. Read more about it in the [[#Custom software]] paragraph.

===== Special software for Particle Physics users =====

These features are only available on the ''hep'' partition. Other users of that partition can use them if they want.

==== CVMFS ====

Particle Physics users might want to use CVMFS to configure their own research software. This system is now available on the Aurora nodes and is the recommended way to do CERN-related analysis on the cluster. It will make your code and init scripts the same as on almost every cluster that gives access to particle physicists within SNIC and at any cluster at CERN, so learning how to do this on Aurora is good experience.

**cvmfs** is a networked filesystem that hosts CERN software. The mount path is ''/cvmfs/''. To initialize the environment, run the following lines of code:

  export ATLAS_LOCAL_ROOT_BASE="/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase"
  source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh

:!: At the moment there are some problems with enabling the cvmfs scripts at every login. If you add the above lines to your ''.bash_profile'' file in your home folder, interactive access to the cluster will NOT work, and maybe even submission will be broken. The reason for this is not yet known and will be investigated. It is therefore **NOT RECOMMENDED** to add those lines to your ''.bash_profile''. Run them **after** you login to an interactive session, or add them in the prologue of your batch job.

==== Singularity ====

The old singularity version (1.0) is going to be removed in week 8, year 2017 because of a security issue. Please update your scripts to enable singularity using these commands:

  module load GCC/4.9.3-2.25
  module load Singularity/2.2.1

It is recommended to **always use the version number** when loading the module, to prevent issues with different versions. If you omit it and the default module changes, you will run an unwanted version.

===== Custom software =====

Custom software can be installed in the locations listed below. It is up to the user community to develop scripts to configure the environment; see the sketch after the table for an example. Once the software is built and configurable we can consider creating our own modules to enable it. Ask Florido to help you with such development, and these modules will be shared on Aurora using the same mechanism as the other software (''module spider'').

^ Division ^ Folder path ^
| Nuclear Physics | ''/projects/hep/nobackup/software/np'' |
| Particle Physics | ''/projects/hep/nobackup/software/pp'' |
| Theoretical Physics | ''/projects/hep/nobackup/software/tp'' |
| Mathematical Physics | Please use your home folder for now. We are negotiating a 10GB project space. |
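As a minimal sketch of such an environment script, assuming a hypothetical tool installed under the Particle Physics folder (the tool name, version and subdirectories are made up; only the project path comes from the table above):

  #!/bin/bash
  # setup-mytool.sh: hypothetical environment script for software installed
  # under the Particle Physics project folder. Source it, do not execute it:
  #   source /projects/hep/nobackup/software/pp/mytool/setup-mytool.sh

  # Root of the (made-up) installation
  MYTOOL_HOME=/projects/hep/nobackup/software/pp/mytool/1.0

  # Make its binaries and libraries visible in the current shell
  export PATH="${MYTOOL_HOME}/bin:${PATH}"
  export LD_LIBRARY_PATH="${MYTOOL_HOME}/lib:${LD_LIBRARY_PATH}"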