This page contains information on how to access the cluster and use its computing power.
Please read this section carefully.
During this testing phase, we will allow direct access to nodes of the cluster.
For security reasons, accessing the cluster is not straightforward at the moment, but don't be scared! By the end of September we will find a better way to access the cluster facilities.
This test round has the following objectives:
How will that work?
How will it change AFTER the testing phase?
In short, the current setup neither maximizes the computing power the cluster offers nor allows fair sharing, but it does let researchers start using the facility.
We plan to finish the node testing phase by mid September, and to start testing the batch interface by the end of September.
Thanks for your help during this testing time and we hope to have a lot of fun!
The cluster is currently composed of three elements: a gateway, a storage server and a set of 12 nodes.
For the time being, there has been no time to set up direct access to the cluster from the internet. Only two machines are allowed to access the cluster, and these are:
Every node of the cluster can access the shared storage. All users can access the shared storage, but only the areas assigned to the working groups they belong to. The shared storage is organized as follows:
|Folder name|Folder location|Folder purpose|Expected file size|Description|Subfolders|
|users| |User homes|Files smaller than 100MB each|This folder contains each user's private home folder. In it one should keep one's own code and, possibly, private data. Data that others may also use, or single files bigger than 100MB, should not go in this folder; use the shared folder instead.|/<username>, one folder per user|
|software| |Application software|Files smaller than 100MB each|This folder hosts software that is not accessible via cvmfs (see later). This usually includes user- or project-specific libraries and frameworks.| |
|shared| |Data that will stay for the long term|Any file, especially big ones|This folder should be used for long-term data storage, for example data needed for the whole duration of a PhD project or shared among people belonging to the same research group.| |
|scratch| |Data that will stay for the short term|Any file, especially big ones|This folder should be used for short-term data storage, for example data needed for a week-long or temporary calculation. This folder should be considered unreliable, as its contents will be purged from time to time; the cleanup interval is yet to be decided.| |
|cvmfs| |Special folder containing CERN-maintained software|Users cannot write|This special folder is dedicated to software provided by CERN and is read-only. Its contents are usually managed via specific scripts that a user can run. If you need some software that you cannot find, contact the administrators.| |
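To make this concrete, here is a small sketch of what the paths look like, assuming the storage is mounted under /nfs as in the data transfer examples further down, and that pp is the Particle Physics group area:
ls /nfs/users/<username>   # your private home folder
ls /nfs/software/pp/       # group-specific software
ls /nfs/shared/pp/         # long-term data shared within your group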
Three main UNIX user groups are defined, as follows:
|User group|Who belongs to it|Group hierarchy|
|npusers|Researchers belonging to Nuclear Physics|primary|
|ppusers|Researchers belonging to Particle Physics|primary|
|clusterusers|All users accessing the cluster|secondary|
The group hierarchy determines how your files are owned. Whenever you create a file, its default group ownership will be:
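For example (a minimal sketch; group names as in the table above), you can check your groups and the default group of a newly created file like this:
id                    # shows your primary group (npusers or ppusers) and your secondary group (clusterusers)
touch testfile
ls -l testfile        # the group column shows the group ownership the file got by default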
As said, currently access to nodes must happen via special machines.
A typical access routine is the following:
Let's see these in detail.
<username> is the same one used to log in to teddi or to your own laptop.
<username> is your username on the cluster as given by the administrators.
You will be accessing a special shell in which you'll see which node you are assigned to. Assigned nodes can also be seen in Assigned Nodes
For Particle Physicists, the username is the same one used to log in to teddi or to your own laptop.
<username> is your username on the cluster as given by the administrators.
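Putting the steps together, here is a minimal sketch of the hop sequence for Particle Physicists (hostnames as in the ssh configuration example further down; replace X with a node assigned to you):
ssh <username>@tjatte.hep.lu.se      # from your own machine to the division machine
ssh <username>@iridium.lunarc.lu.se  # from tjatte to the cluster gateway
ssh nX                               # from the gateway to your assigned node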
NOTE: There is no check upon login; you can log into a node that is not assigned to you. PLEASE DON'T DO THAT. Please check first. Security enforcement is possible, but it is not the purpose of this testing phase. If issues come up, we will restrict access accordingly.
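A quick way to verify which node you are on after logging in:
hostname    # compare with your entry in Assigned Nodes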
The administrators provide scripts for a quick setup of your work environment.
Just execute the command in the column Script to run at the shell prompt, or add it to your .bash_profile file so that it is executed every time you log in.
The following are active now:
|Environment|Script to run|Description|
|ATLAS Experiment environment| |Will set up all the needed environment variables for the ATLAS experiment, and present a selection of other environments that the user can set up.|
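For example, to make the setup permanent (a sketch; replace <setup command> with the actual command from the Script to run column):
<setup command>                            # run it once in the current shell
echo '<setup command>' >> ~/.bash_profile  # or have it executed automatically at every login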
Suggestions on how to make your life easier when using the cluster.
One can speed up logging in by configuring one's own ssh client. This will also help when scp-ing data to the cluster.
My suggestion for Particle Physicists is to copy this piece of configuration into their own .ssh/config file and adapt it to their specific needs:
# access tjatte
Host tjatte
    HostName tjatte.hep.lu.se
    User <username on tjatte>
    ForwardX11 yes

# directly access iridium gateway
Host iridiumgw
    User <Username on iridium>
    ForwardX11 yes
    ProxyCommand ssh -q tjatte nc iridium.lunarc.lu.se 22

# directly access node X
Host nX.iridium
    User <Username on iridium>
    ForwardX11 yes
    ProxyCommand ssh -q iridiumgw nc nX 22

# directly access node Y
Host nY.iridium
    User <Username on iridium>
    ForwardX11 yes
    ProxyCommand ssh -q iridiumgw nc nY 22
Example: my user is florido. In the template above, I would change all the <Username …> entries to florido; then, to log in to n12, I would do:
ssh n12.iridium
And I will have to input 3 passwords: one for tjatte, one for the gateway and one for the node.
If you want to access the cluster nodes from outside the division, you must go through teddi; you may also want to copy the above setup into your home folder on teddi.
If you don't have an account on teddi or direct access to some other division machine, you should ask me to create one.
Where X and Y are the nodes you're allowed to run on.
Note that with the above you will be asked to input as many passwords as there are machines in the connection. A way to ease this pain is to copy ssh keys to the nodes. Copying ssh keys to the gateway is not (yet) possible, hence you will always need two passwords: one for the ssh key and one for the gateway.
An alternative method of authenticating via ssh is by using ssh keys. It will ease the pain of writing many passwords. The only password you will need is to unlock your key.
PLEASE DO NOT USE PASSWORDLESS KEYS. IT IS A GREAT SECURITY RISK.
Read about them here:
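A minimal sketch of creating a passphrase-protected key and installing it on a node (the n12.iridium alias assumes the ssh configuration shown above):
ssh-keygen -t rsa -b 4096     # generate a key pair; enter a non-empty passphrase when prompted
ssh-copy-id n12.iridium       # copy your public key to one of your assigned nodes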
Use screen. GNU screen is an amazing tool that opens a remote terminal session independent of your ssh connection. If the connection drops or you accidentally close the ssh window, your jobs will keep running on the cluster.
A quick and dirty tutorial can be read here, but there's plenty more on the internet.
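A minimal usage sketch (the session name is just an example):
screen -S myjob    # start a named session on the node
# run your job, then detach with Ctrl-a d
screen -ls         # list your sessions
screen -r myjob    # reattach later, even after a dropped connection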
Please read the section Common files organization before going through this section.
Please read this carefully.
When moving data to the shared folders, please follow these common sense rules:
Do not expect data in the scratch folder to always be there. We still have no purging policy for it, but we will have meetings in which we decide about it.
Here are some solutions to move data to the cluster. 1-3 are generic data transfer tools; 4-5 are GRID-oriented data transfer tools (mostly for Particle Physicists).
The marked ones are my favourite — Florido Paganelli 2013/08/27 20:20
Example (scp): copying the file ubuntu-12.04.2-desktop-amd64.iso from my local machine to the group's shared folder on the cluster:
scp ubuntu-12.04.2-desktop-amd64.iso n12.iridium:/nfs/shared/pp/
Example (rsync): copying the file ubuntu-12.04.2-desktop-amd64.iso from my local machine to the group's software folder on the cluster:
rsync -avz --progress ubuntu-12.04.2-desktop-amd64.iso n12.iridium:/nfs/software/pp/
More about it: https://filezilla-project.org/download.php?type=client