Here I am keeping track of the work I am doing on the cluster, so that everybody can follow the progress.
The log reads top to bottom from the most recent change to the oldest.
— Florido Paganelli 2014/06/12 19:19
shared each node's tmp folder in
/nodestmp/<nodename>

documentation about it needs to be written
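
Until that documentation exists, here is a minimal sketch of the naming scheme. Only the /nodestmp/<nodename> layout comes from this log. That the shares are served through autofs over NFSv4, and that each node exports /tmp, are assumptions made for illustration; the helper names are hypothetical.

```python
def nodestmp_path(nodename: str, base: str = "/nodestmp") -> str:
    """Return the shared location of a node's tmp folder."""
    if not nodename or "/" in nodename:
        raise ValueError(f"invalid node name: {nodename!r}")
    return f"{base}/{nodename}"


def autofs_map(nodenames, export="/tmp"):
    """Build autofs map lines, one key per node.

    ASSUMPTION: autofs + NFSv4 and the export path are illustrative,
    not taken from the actual cluster configuration.
    """
    return [f"{name} -fstype=nfs4 {name}:{export}" for name in sorted(nodenames)]


if __name__ == "__main__":
    for node in ("n1", "n2"):
        print(nodestmp_path(node))
    for line in autofs_map(["n2", "n1"]):
        print(line)
```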

— Florido Paganelli 2014/06/02 17:40
updated and rebooted nptest and pptest
installed quantum espresso and RTE
updated salt configuration
committed salt config to git and backed it up
— Florido Paganelli 2014/04/28 19:33
updated arc interface grid certificate
joined NorduGrid Sweden indexes
configured syslog installation. Needs tweaking of hostname on nodes with multiple interfaces. These nodes do not have special configuration in salt yet.
— Florido Paganelli 2014/03/21 21:00
— Florido Paganelli 2014/03/17 20:39
added network configuration for n1 (nptest-iridium) and n2 (pptest-iridium)
rebooted nodes, now accessible from the internet. eth0 shut down awaiting iptables config
todo: block access to other nodes, only allow course users, disable slurm, configure iptables
— Florido Paganelli 2014/03/14 18:00
— Florido Paganelli 2014/03/10 17:19
updated salt-master on service-iridium due to incompatibilities with newer minions
created two simple queues in slurm
reconfigured all slurm nodes
reconfigured arc frontend
— Florido Paganelli 2014/03/07 17:23
— Florido Paganelli 2014/02/21 10:26
installed and configured arc. Job submission with emi-es fails, but maybe ARC 4.1 will solve it.
configured arc with slurm and cache; however, it would be better to have two grid users (one for each division) and two queues.
— Florido Paganelli 2014/02/18 13:55
installed salt on arc-iridium
installed other trustanchors on arc-iridium
configured firewall (still needs cleanup)
installed and configured munge, slurm, autofs, nfs
— Florido Paganelli 2014/02/17 19:07
installed NG repos on arc-iridium and trustanchors
configured a-rex using instantca
initiated process of requesting host cert
asked Lunarc to open ports for ARC services
— Florido Paganelli 2014/02/11 18:29
updated hep-srv

reboot didn't work, help request submitted to Lunarc.
installed epel on arc-iridium
installed ARC on arc-iridium
installed munge on arc-iridium
— Florido Paganelli 2014/02/07 18:31
— Florido Paganelli 2014/02/04 18:31
rebooted kvm-iridium
created arc-iridium machine
updated kvm-iridium to Centos6.5
cloning disk for arc-iridium
started service-iridium update
— Florido Paganelli 2014/01/31 16:02
— Florido Paganelli 2014/01/30 17:45
finished configuring SLURM, tests started
copied ssh keys
updated git
updated iptables configuration for all nodes
changed anaconda scripts (to be tested!) to include ssh key retrieval
— Florido Paganelli 2014/01/29 16:55
— Florido Paganelli 2014/01/28 17:28
added 'MUNGE' to all nodes incl. 'service-iridium'. Not clear where the frontend should run, but I would say on the same machine that runs ARC. That means it has to share the secret key.
generated the secret key on 'service-iridium'.
— Florido Paganelli 2013/11/07 14:03
applied package adds to all nodes
updated software package for HEP in salt/common
updated salt config to create directories in /nfs, applied to all nodes
restarted salt-minion on all nodes
— Florido Paganelli 2013/11/06 17:42
— Florido Paganelli 2013/09/20 18:12
— Florido Paganelli 2013/09/20 15:19
— Florido Paganelli 2013/08/22 12:40
— Florido Paganelli 2013/08/21 21:48
configured gateway restricted shell
added new users to the cluster
solved issue with password change. Passwords cannot be changed by users now. :TODO: solve this security issue using PAM on all machines.
added missing users to tjatte for testing round
— Florido Paganelli 2013/08/20 17:48
morning meeting with Pico to sort out some technicalities. Decisions: change the way direct login is done. Have a limited shell gateway.
Afternoon meeting with e-Science group: set new roadmap
Redesigned cluster web pages
— Florido Paganelli 2013/08/19 21:14
— Florido Paganelli 2013/08/16 18:58
configured clusterip mode for load balancing between testing nodes n1 and n2
added SALT configuration for clusterip (needs more work, awaiting Luis)
changed atlas.sh
configured kvm-iridium to forward to CLUSTERIP
automated salt-call execution at boot time for each node. Node configures itself and reboots at installation time.
tested cvmfs on Centos6, seems to work
reinstalled n1 with cvmfs
ran some tests with cvmfs runKV for missing libraries etc. on n1. Dump is in pflorido user folder.
— Florido Paganelli 2013/08/14 20:23
— Florido Paganelli 2013/08/13 18:41
nfs configuration changed to nfs4 for SALT keys distribution
sl6 installation automation script set up.
decisions on how to move Nuclear Physics data into the cluster have been taken: access to a single node of the cluster enabled to allow Luis to start tests. Open access to other researchers requires Lunarc intervention.
— Florido Paganelli 2013/08/12 18:27
finalized nfs4 configuration on storage.
created autofs salt configuration.
created NIS salt configuration
versioned salt folder with git.
started cvmfs salt automatic configuration
changed partitioning scheme on sl6 nodes to accommodate cvmfs. This triggered creation of an sl6 kickstart file.
testing sl6 kickstart file
— Florido Paganelli 2013/08/09 18:00
reconfiguring NFS shares to be compliant with nfs4.
meeting w Luis on SALT operations
preliminary discussion on dataset and software deployment.
— Florido Paganelli 2013/08/07 17:31
added SALT iptables configuration on service-iridium
created auto.master and auto.home NIS maps for automatic configuration of mounts. This is probably better done with salt with proper auto.home configuration on each node.
changed auto.master to include NIS auto.home to n2 for automatic configuration of mounts.
writing documentation on how to add new users. Discovered an issue with autofs setup.
— Florido Paganelli 2013/08/06 18:45
better NFS configuration on storage-iridium.
meeting with Luis on ssh host key sharing
configuration of storage-iridium and NFS for secure key sharing
— Florido Paganelli 2013/08/05 19:10
— Florido Paganelli 2013/08/02 18:20
cloned sdb1 and sdb3 from kvm-iridium virtimages pool with dd to storage-iridium /export/backupimages/. Attempt to use 'virsh vol-download' failed: it took one day to transfer 2GB. Let's not do that anymore.
removed sdb4 partition on kvm-iridium virtimages (can't remember why it was there)
restarted sshgateway(iridium) and service-iridium
configured kickstart to install SL6
installed SL6 on n2
set up a specific partition for cernvmfs
installed cernvmfs
fixed an issue with service-iridium hosts file, had wrong IP address
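
For the record, the 'virsh vol-download' figure above (one day for 2GB) works out as follows. This is a rough calculation assuming decimal gigabytes and exactly 24 hours; both numbers in the log are approximate.

```python
GB = 10**9                 # decimal gigabyte (assumption)
DAY = 24 * 60 * 60         # seconds in one day


def throughput_bytes_per_s(size_bytes: float, seconds: float) -> float:
    """Average transfer rate in bytes per second."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return size_bytes / seconds


rate = throughput_bytes_per_s(2 * GB, DAY)

if __name__ == "__main__":
    print(f"{rate / 1000:.1f} kB/s")   # about 23 kB/s
```

At roughly 23 kB/s the transfer is far below what any of the cluster's network links should deliver, which is why dd was used instead.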
— Florido Paganelli 2013/08/01 15:16
enabled Luis on hep-monitor
issues with virtualization layer, storage management. Hypervisor machine updated and restarted.
— Florido Paganelli 2013/07/31 16:30
— Florido Paganelli 2013/07/31 11:35
reconfigured all machines to use second network card for boot
started kickstart CentOS6 installation on n2,n3,n4
— Florido Paganelli 2013/07/29 16:52
created a user and a vnc server instance for Luis
reconfigured hep-srv to be able to resolve internal hostnames
started experimenting with kickstart
— Florido Paganelli 2013/07/26 17:08
finished configuring named (DNS) on service-iridium. Now nodes can find other machines.
issue with hep-srv: probably broken network config. Sent email to Rickard and Robert from Lunarc.
configured n1 to correctly join the domain
port-forwarded one of the nodes (n1)
— Florido Paganelli 2013/07/25 18:24
— Florido Paganelli 2013/07/24 18:29
installed dhcp server on service-iridium
DONE: nodes BIOS setup must be changed, the default ethernet boot is not the configured one.
DONE: ethernet addresses need to be updated in dhcp server.
DONE: issues in dispatching dns must be solved. Maybe installing bind or dnsmasq.
installed PXE booting system on service-iridium
installed tftp server on service-iridium. Issues with selinux.
changed storage-iridium iptables to serve nfs folders
created a directory for boot images on storage-iridium
successfully booted a node for installation. A mirror of Centos6 is needed to complete the install via nfs
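
Updating the ethernet addresses in the dhcp server by hand is error-prone, so here is a minimal sketch of generating the host entries. It assumes the server is ISC dhcpd (this log does not say which dhcp server is installed); the MAC and IP addresses in the example are placeholders, not the real ones.

```python
import re

MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def dhcpd_host(name: str, mac: str, ip: str) -> str:
    """Return one ISC dhcpd 'host' stanza binding a MAC to a fixed IP."""
    mac = mac.lower()
    if not MAC_RE.match(mac):
        raise ValueError(f"not a MAC address: {mac!r}")
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip};\n"
        f"}}\n"
    )


if __name__ == "__main__":
    # Placeholder addresses, for illustration only.
    nodes = [("n1", "52:54:00:00:00:01", "10.0.0.11"),
             ("n2", "52:54:00:00:00:02", "10.0.0.12")]
    print("".join(dhcpd_host(*node) for node in nodes))
```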
— Florido Paganelli 2013/07/23 18:07
created xfs filesystem on 30TB storage
created directories to be shared among nodes
set up quotas as discussed with Luis. A defined description of quotas must be added to the cluster description document.
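
As a starting point for that description, here is a sketch of how per-folder quotas are usually declared on xfs, through project quotas: /etc/projects maps a numeric project id to a directory, and /etc/projid maps a project name to that id. The project names and directories below are hypothetical, not the ones agreed with Luis; the limits themselves would then be set with xfs_quota, which is not shown here.

```python
def project_quota_files(dirs: dict) -> tuple:
    """Build the contents of /etc/projects and /etc/projid.

    dirs maps a project name to the directory it covers.
    Ids are assigned from 1 in alphabetical order of the names.
    """
    projects, projid = [], []
    for pid, (name, path) in enumerate(sorted(dirs.items()), start=1):
        projects.append(f"{pid}:{path}")   # /etc/projects: id:directory
        projid.append(f"{name}:{pid}")     # /etc/projid:   name:id
    return "\n".join(projects) + "\n", "\n".join(projid) + "\n"


if __name__ == "__main__":
    # Hypothetical projects, for illustration only.
    etc_projects, etc_projid = project_quota_files(
        {"hep": "/export/hep", "nuclear": "/export/nuclear"}
    )
    print(etc_projects, end="")
    print(etc_projid, end="")
```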
— Florido Paganelli 2013/07/22 18:39
profiling storage usage after meeting w Luis
understanding xfs features
understanding logical volume management basics. LVM2 will be used.
created logical volumes on 30TB storage
quotas will be managed by xfs on folders. To be done.
— Florido Paganelli 2013/07/10 16:39
— Florido Paganelli 2013/07/08 16:34
— Florido Paganelli 2013/07/05 17:53
configured service machine service-iridium
configured iptables and NAT routing on sshgateway
configured NIS server on service-iridium
configured NIS client on sshgateway
configured iptables for NIS on service-iridium
— Florido Paganelli 2013/07/02 13:54
Setup wiki to keep track of progress
Created a machine sshgateway iridium.lunarc.lu.se to be used as main ssh gateway and to host some of the services.
Configured frontend kvm-iridium machine networking to be ready for hosting.