
Iridium Cluster: How to use

This page contains information on how to access the cluster and use its computing power.

Purpose of the testing phase starting August 22, 2013

:!: Please read this section carefully. :!:

During this testing phase, we will allow direct access to nodes of the cluster.

For security reasons, accessing the cluster is currently somewhat cumbersome, but don't be scared! By the end of September we will provide a better way to access the cluster facilities.

This test round has the following objectives:

  • Test the performance of the single nodes
  • Test the validity of a single node as a testing environment for analysis-code development
  • Address any issues with missing software or functionality on each node
  • Get feedback from users to improve the service

How will that work?

  • During this test round, every researcher will be assigned a set of nodes (typically two) that he/she can use for computational tasks.
  • Drastic changes (for example, adding missing libraries or per-node software) cannot be made by users on their own; they will be made in cooperation with the administrators.
  • Administrators will help researchers solve issues step by step.

How will it change AFTER the testing phase?

  • Users will be able to directly access, in an easier way, only two of the nodes, for the purpose of testing their code and submitting batch jobs.
  • Batch jobs will be processed by the cluster batch interface, making it possible to use the full power of all the nodes together, in a way that is fair to all users.

In short, the current setup does not maximize the computing power the cluster offers, and does not allow fair share, but it allows researchers to start using the facility.

We plan to finish the node testing phase by mid September, and to start testing the batch interface by the end of September.

Thanks for your help during this testing time and we hope to have a lot of fun!

Things one needs to know

Short summary of what the cluster is

The cluster is currently composed of three elements: a gateway, a storage server and a set of 12 nodes.

  • the gateway is the entry point through which users access the cluster.
  • the storage server maintains user home folders, software and data. See Common files organization below for details.
  • the nodes are where you will actually run your code. Each node has:
    • a simple name: nX, where X is the number of the node, e.g. n1 is node number 1.
    • 16 cores
    • 64GB of RAM
    • access to all folders served by the storage server. This means that a researcher has her own home folder regardless of the node she logs in to.

For the time being, direct access to the cluster from the internet has not been set up. Only two machines are allowed to access the cluster, and these are:

  • for Nuclear Physicists, alpha.nuclear.lu.se. Contact Pico to get access.
  • for Particle Physicists, tjatte.hep.lu.se. Contact Florido Paganelli to get access. Detailed instructions are shown later in this page.

Common files organization

Every node of the cluster can access the shared storage. All users can access the shared storage, but only the areas assigned to the working groups they belong to. The shared storage is organized as follows:

users (/nfs/users)
  Purpose: user home folders
  Expected file size: files smaller than 100MB each
  Description: This folder contains each user's private home folder. Save your own code and, if needed, private data here. Data that may also be used by others, or single files bigger than 100MB, do not belong here: use the shared folder instead.
  Subfolders: /<username> (each user her own folder)

software (/nfs/software)
  Purpose: application software
  Expected file size: files smaller than 100MB each
  Description: This folder hosts software that is not accessible via cvmfs (see below). This usually includes user- or project-specific libraries and frameworks.
  Subfolders: /np for Nuclear Physics users, /pp for Particle Physics users

shared (/nfs/shared/)
  Purpose: data that will stay for the long term
  Expected file size: any file, especially big ones
  Description: This folder should be used for long-term data storage, for example data needed for the whole duration of a PhD project, or shared among people belonging to the same research group.
  Subfolders: /np for Nuclear Physics users, /pp for Particle Physics users

scratch (/nfs/scratch/)
  Purpose: data that will stay for the short term
  Expected file size: any file, especially big ones
  Description: This folder should be used for short-term data storage, for example data needed for a week-long calculation or a temporary calculation. This folder should be considered unreliable, as its contents will be purged from time to time; the cleanup interval is yet to be decided.
  Subfolders: /np for Nuclear Physics users, /pp for Particle Physics users

cvmfs (/cvmfs)
  Purpose: special folder containing CERN-maintained software
  Expected file size: users cannot write here
  Description: This special folder is dedicated to software provided by CERN and is read-only. Its contents are usually managed via specific scripts that a user can run. If you need some software that you cannot find, contact the administrators.
  Subfolders: /geant4.cern.ch for Nuclear Physics users, /atlas.cern.ch for Particle Physics users

User groups

Three main UNIX user groups are defined, as follows:

User group Who belongs to it Group hierarchy
npusers Researchers belonging to Nuclear Physics primary
ppusers Researchers belonging to Particle Physics primary
clusterusers All users accessing the cluster secondary

Group hierarchy determines the default ownership of the files you create. Whenever you create a file, its default ownership will be:

  • user: your username
  • group: the primary group you belong to
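As a minimal illustration (runnable on any Linux machine; the temporary directory and file name are just examples, and the user/group you see will be your local ones, not the cluster groups), you can check the default ownership a newly created file gets:

```shell
# Create a file in a temporary directory and show its default ownership.
tmp=$(mktemp -d)
touch "$tmp/example.txt"
# GNU coreutils 'stat': %U = owning user, %G = owning group.
stat -c 'user=%U group=%G' "$tmp/example.txt"
rm -r "$tmp"
```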

Accessing Testing nodes

As mentioned above, access to the nodes currently has to go through special machines.

A typical access routine is the following:

  1. Access the special machines for your division.
  2. login to the iridium access gateway
  3. login to one of the nodes you are assigned to
  4. setup the work environment

Let's look at each step in detail.

1) Access the special machines for your division.

Particle Physics

simply run:

ssh <username>@tjatte.hep.lu.se

where <username> is the same username you use to log in to teddi or to your own laptop.

Nuclear Physics

coming soon

2) login to the iridium access gateway

simply run:

ssh <username>@iridium.lunarc.lu.se

where <username> is your username on the cluster, as given by the administrators.

You will land in a special shell that shows which node(s) you are assigned to. Assigned nodes can also be seen in Assigned Nodes below.

:!: for Particle Physicists, the username is the same one used to log in to teddi or to your own laptop.

3) login to one of the nodes you are assigned to

simply run:

ssh <username>@nX

where

  • <username> is your username on the cluster, as given by the administrators.
  • X is the number of one of the nodes you are assigned in Assigned Nodes

:!: NOTE :!: : There is no check upon login: you can log into a node that is not assigned to you. PLEASE DON'T. Always check your assignment first. Security enforcement is possible, but it is not the purpose of this testing phase; if we encounter issues, we will restrict access accordingly.

4) setup the work environment

The administrators provide scripts for quick setup of your work environment. Just execute the command in the column Script to run at the shell prompt, or add it to your .bashrc or .bash_profile file so that it is executed every time you log in.

The following are active now:

Environment Script to run Description
ATLAS Experiment environment setupATLAS Sets up all the needed environment variables for the ATLAS experiment, and presents a selection of other environments that the user can set up.
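The .bashrc approach can be sketched as follows. This demo uses a temporary file standing in for your real .bashrc, so nothing on your account is modified; the trick is to append the command only if it is not already present, so repeated logins don't accumulate duplicates:

```shell
# Demonstration on a temporary file standing in for ~/.bashrc:
# append the setup command only if it is not already there (idempotent).
rc=$(mktemp)                     # stands in for $HOME/.bashrc in this demo
grep -qxF 'setupATLAS' "$rc" || echo 'setupATLAS' >> "$rc"
grep -qxF 'setupATLAS' "$rc" || echo 'setupATLAS' >> "$rc"   # second run adds nothing
grep -cxF 'setupATLAS' "$rc"     # prints 1: the line was added exactly once
rm "$rc"
```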

Tips'n'Tricks

Suggestions on how to make your life easier when using the cluster.

Tips to speed up logging in

One can speed up logging in by configuring one's own ssh client. This will also help when scp-ing data to the cluster.

Particle Physics

My suggestion for Particle Physicists is to copy this piece of configuration into your own .ssh/config file and adapt it to your specific needs:

# access tjatte
Host tjatte
HostName tjatte.hep.lu.se
User <username on tjatte>
ForwardX11 yes

# directly access iridium gateway
Host iridiumgw
User <Username on iridium>
ForwardX11 yes
ProxyCommand ssh -q tjatte nc iridium.lunarc.lu.se 22

# directly access node X
Host nX.iridium
User <Username on iridium>
ForwardX11 yes
ProxyCommand ssh -q iridiumgw nc nX 22

# directly access node Y
Host nY.iridium
User <Username on iridium>
ForwardX11 yes
ProxyCommand ssh -q iridiumgw nc nY 22

Example: My user is florido. In the template above, I would change all the <Username …> to florido, and nX to n12.

Then, to log in to n12, I run:

ssh n12.iridium

And I will have to input 3 passwords: one for tjatte, one for the gateway and one for the node.

If you want to access the cluster nodes from outside the division, you must go through teddi and, if needed, copy the above setup into the .ssh folder of your home directory there. If you don't have an account on teddi or direct access to some other division machine, you should ask me to create one.

In the template above, X and Y stand for the nodes you are allowed to use.

Note that with the above setup you will be asked for as many passwords as there are machines in the connection chain. A way to ease this pain is to copy your ssh keys to the nodes. Copying ssh keys to the gateway is not (yet) possible, so you will always need at least two passwords: one to unlock your ssh key and one for the gateway.

Nuclear Physics

coming soon


Speed up login by using ssh keys

An alternative method of authenticating via ssh is to use ssh keys. This eases the pain of typing many passwords: the only password you will need is the one that unlocks your key.

:!: PLEASE DO NOT USE PASSWORDLESS KEYS. IT IS A GREAT SECURITY RISK. :!:

Read about them here:

https://wiki.archlinux.org/index.php/SSH_Keys
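As an illustration, a key pair can be created with ssh-keygen. This is a local sketch: the key type, size, passphrase and file location are demo choices, not recommendations, and the passphrase is passed on the command line only to keep the demo non-interactive.

```shell
# Generate an RSA key pair in a temporary directory.
# The passphrase is passed with -N here only to keep the demo non-interactive;
# when you create your real key, let ssh-keygen prompt you for it
# (and never leave it empty).
dir=$(mktemp -d)
ssh-keygen -q -t rsa -b 2048 -N 'demo-passphrase' -f "$dir/id_rsa"
ls "$dir"        # id_rsa (private key) and id_rsa.pub (public key)
rm -r "$dir"
```

The public key (the .pub file) is what you copy to the remote machines; the private key never leaves your computer.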

How not to lose all your work because you closed an ssh terminal

Use screen. GNU screen is an amazing tool that opens a remote terminal session independent of your ssh connection. If the connection drops, or you accidentally close the ssh window, your jobs will keep running on the cluster.

A quick and dirty tutorial can be read here, but there's plenty more on the internet.


Moving data to the cluster

Please read the section Common files organization before going through this section.

Rules of thumb

:!: Please read this carefully. :!:

When moving data to the shared folders, please follow these common sense rules:

  • Create folders for everything you want to share.
  • If the data has been produced by you, it is nice to create a folder with your name and place everything in it.
  • If the data belongs to some specific experiment, dataset or the like, give the folder a name that is consistent with that, so that everybody can easily understand what it is about.
  • Don't overdo it. Only copy data that you or your colleagues need. This is a shared facility.
  • Don't remove other users' files unless you have advised them and they are ok with it. This is a shared facility.
  • Don't expect the contents of the scratch folder to always be there. We have no policy for that yet, but we will have meetings to decide on one.
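The naming rules above can be sketched as follows; a temporary directory stands in for the shared area, and 'florido' and 'testbeam-2013' are made-up example names:

```shell
# Sketch of the naming rules above. A temporary directory stands in for
# /nfs/shared/pp; 'florido' and 'testbeam-2013' are example names.
base=$(mktemp -d)
mkdir -p "$base/florido/testbeam-2013"
ls "$base/florido"               # prints: testbeam-2013
rm -r "$base"
```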

Data transfer solutions

Here are some solutions for moving data to the cluster. Solutions 1-3 are generic data transfer tools; solutions 4-5 are GRID-oriented data transfer tools (mostly for Particle Physicists).

Those marked with 8-) are my favourites — Florido Paganelli 2013/08/27 20:20

Solution 1: scp, sftp, lftp

  • Pros:
    • easy
    • only needs terminal
    • available almost everywhere
    • progress indicator
  • Cons:
    • not reliable: if the connection goes down, one must restart the entire transfer.
    • does not work with GRID storage
    • slow

Example:

Moving ubuntu-12.04.2-desktop-amd64.iso from my local machine to n12.iridium

  scp ubuntu-12.04.2-desktop-amd64.iso n12.iridium:/nfs/shared/pp/

Solution 2: rsync

8-)

  • Pros:
    • Reliable: if the connection goes down, it will resume from where it stopped.
    • Minimizes amount of transferred data by compressing it
    • only needs terminal
    • available on most GNU/Linux platforms
    • a bit faster
  • Cons:
    • Awkward command line
    • bad logs
    • poor progress indicator on many files
    • available on windows but needs special installation
    • does not work with GRID storage

Example:

Moving ubuntu-12.04.2-desktop-amd64.iso from my local machine to n12.iridium

  rsync -avz --progress ubuntu-12.04.2-desktop-amd64.iso n12.iridium:/nfs/software/pp/
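You can also try rsync locally, without the cluster, to get familiar with it; this sketch mirrors a small directory tree between two temporary directories (all paths are made up for the demo):

```shell
# Local demonstration of rsync archive mode (-a), no cluster needed:
# mirror a small directory tree between two temporary directories.
src=$(mktemp -d); dst=$(mktemp -d)
mkdir -p "$src/data"
echo "hello" > "$src/data/file.txt"
rsync -a "$src/" "$dst/"
cat "$dst/data/file.txt"         # prints: hello
rm -r "$src" "$dst"
```

Note the trailing slash on "$src/": it tells rsync to copy the contents of the directory rather than the directory itself.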

Solution 3: FileZilla

  • Pros:
    • Reliable. Tries to resume if the connection went down.
    • Visual interface
    • good logs
    • progress bar ^_^
    • Available for both GNU/Linux and windows
  • Cons:
    • Visual interface :D
    • does not work with GRID storage

More about it: https://filezilla-project.org/download.php?type=client

Solution 4: NorduGrid ARC tools (arccp, arcls, arcrm)

  • Pros:
    • works with GRID storage
  • Cons:
    • doesn't work with ATLAS datasets (yet ;-) )
    • uncommon command line interface


Solution 5: dq2 tools

  • Pros:
    • works with GRID storage
  • Cons:
    • only works with ATLAS datasets
    • uncommon command line interface (but some are used to it)


Assigned Nodes

Researcher nodes
Pico n1, n7
Lene n2, n8
Oleksandr n6, n12
Anthony n3, n9
Anders n4, n10
Inga n5, n11
  • nodes n1-n6 run CentOS 6.4
  • nodes n7-n12 run Scientific Linux 6.4
iridium_cluster/howtos_users.txt · Last modified: 2013/10/15 15:57 by florido