Iridium cluster Work In Progress
Roadmap
New schedule set on September 9th, 2015.
1) Reconfigure the service disks on the frontend with a more flexible technology, e.g. LVM (see the sketch after this list).
2) Reinstall the service machines with LXC containers instead of KVM virtualization (see the LXC sketch after this list).
3) Connect to the newly bought storage at Lunarc.
4) Connect to the default DDN storage at Lunarc.
5) Configure the new nodes to integrate into Lunarc.
6) Configure access to be integrated into Lunarc.
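A minimal sketch of what the LVM reconfiguration in item 1 could look like, assuming a spare disk /dev/sdb is dedicated to service data; the disk, volume group and logical volume names are placeholders, not the actual Iridium layout.

  # Turn the spare disk into a flexible LVM volume
  pvcreate /dev/sdb
  vgcreate vg_service /dev/sdb
  lvcreate -n lv_services -L 100G vg_service

  # Filesystem and mount point for the service data
  mkfs.xfs /dev/vg_service/lv_services
  mount /dev/vg_service/lv_services /srv

  # The point of LVM: grow the volume later without repartitioning
  lvextend -L +50G /dev/vg_service/lv_services
  xfs_growfs /srv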
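For item 2, replacing a KVM guest with an LXC container could look roughly like this; the container name and distribution are only examples.

  # Create a container from the download template
  lxc-create -n service1 -t download -- -d centos -r 7 -a amd64

  # Start it in the background and check its state
  lxc-start -n service1 -d
  lxc-ls --fancy

  # Get a shell inside the container to install the service
  lxc-attach -n service1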
DONE (as of 9th Sep 2015)
1) Understand how to interact with the existing grid storage at Lunarc. Update: Lunarc suggested benchmarking the current connection (see the sketch after this list). If it is not sufficient, talk to Jens to understand how a direct connection can be achieved. Lunarc only holds the data, no metadata or index.
4) Start batch/grid job tests with users from particle physics and nuclear physics (a test job sketch follows this list). In this phase we will see how resource management should be done for optimal use of the cluster. Missing: ATLAS RTEs.
5) Set up n1 and n2 as test nodes. This includes:
Configure direct connection of the nodes to the internet. DONE.
LDAP+OTP authentication on the nodes. Status: sent the domain name to Lunarc and received a first bit of info back. NOT DONE: impractical for now, there was no time to set it up. To be scheduled for the next iteration and the new Iridium nodes.
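A rough sketch of how the connection to the Lunarc grid storage could be benchmarked (item 1 above): raw network throughput plus a streaming write/read on the mounted storage. The host name and mount point are placeholders.

  # Network throughput towards the storage network (iperf3 server on the far side)
  iperf3 -c storage.lunarc.lu.se -t 30

  # Streaming write and read against the mounted storage
  dd if=/dev/zero of=/mnt/lunarc/testfile bs=1M count=10240 oflag=direct
  dd if=/mnt/lunarc/testfile of=/dev/null bs=1M iflag=direct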
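For the batch tests in item 4, a minimal SLURM job script is enough to verify scheduling and resource limits; the partition name is made up for illustration. Submit it with sbatch and watch it with squeue -u $USER.

  #!/bin/bash
  #SBATCH --job-name=iridium-test
  #SBATCH --nodes=1
  #SBATCH --ntasks=4
  #SBATCH --time=00:10:00
  #SBATCH --partition=test

  # Report where the job landed and exercise all allocated tasks
  echo "Running on $(hostname) with $SLURM_NTASKS tasks"
  srun hostname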
DONE (as of 17th Mar 2014)
DONE (as of 31st Jan 2014)
DONE (as of 29th Jan 2014)
DONE (as of 15th Oct 2013)
DONE (as of 15th Sept 2013)
DONE (as of 20th August 2013)
1) Set up direct SSH access to at least one of the computing nodes.
2) Configure the storage server with minimal services for users (e.g. home folders; a possible NFS export is sketched after this block).
3) Install application software for researchers to run test jobs. This will include the use of CERNVM and needs some coordination with Lunarc.
Update: to stay independent from Lunarc, the meeting suggested finding alternative solutions. Luis set up SALT.
Luis prepared automation of installation on all nodes.
CVMFS installation on one node was successful; the installation automation is an ongoing task (see the sketch after this block).
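A minimal sketch of exporting home folders from the storage server over NFS (item 2 above), assuming the homes live under /export/home and the nodes sit on 10.0.0.0/24; both the path and the subnet are placeholders.

  # /etc/exports on the storage server: export the home area to the node subnet
  #   /export/home  10.0.0.0/24(rw,sync,no_subtree_check)

  # Reload and verify the export table
  exportfs -ra
  exportfs -v

  # On a node, mount the exported homes (normally via /etc/fstab or autofs)
  mount -t nfs storage:/export/home /home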
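A sketch of a client-side CVMFS setup like the one tested on one node; the repository list and proxy URL are example values, not the ones actually used on Iridium.

  # Install the client and run the basic setup
  yum install -y cvmfs
  cvmfs_config setup

  # /etc/cvmfs/default.local -- repositories and local squid proxy (example values)
  #   CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch
  #   CVMFS_HTTP_PROXY="http://squid.example.org:3128"

  # Check that the repositories mount and answer
  cvmfs_config probe

Steps like these are exactly what the SALT automation is meant to roll out identically on all nodes.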
Tech documents
Description of the cluster. Some of these documents might have restricted access; contact me if you need to access those.
Captain's log
All the work that has been done, day by day.
Moved to another page because it was too big: captainslog
Issues
Useful links
TODO: add links to the XFS documentation.
SQUID setup: a caching HTTP proxy (used e.g. as the local cache for CVMFS).
Modules: a nice way of configuring cluster environments
SLURM: a batch system.
MUNGE: a way of authenticating across nodes, needed by SLURM (see the sanity check below).
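Once MUNGE and SLURM are installed, a quick sanity check of the cross-node authentication and the scheduler could look like this; the node name is a placeholder.

  # One shared key, copied to every node, and the munge daemon running everywhere
  create-munge-key
  scp /etc/munge/munge.key node01:/etc/munge/munge.key
  systemctl enable munge
  systemctl start munge

  # A credential created locally must decode on a remote node
  munge -n | ssh node01 unmunge

  # With slurmctld/slurmd up, the nodes should be visible and accept jobs
  sinfo
  srun -N1 hostname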
Howtos
Other
howtos (admin access only)
Various other stuff
Wanted packages on nodes