iridium_cluster:captainslog

Differences

This shows you the differences between two versions of the page.

iridium_cluster:captainslog [2013/11/07 13:05]
florido
iridium_cluster:captainslog [2014/04/28 17:34]
florido
Line 4: Line 4:
  
 The logs can be read top to bottom, from the newest change to the oldest.
 +
 + --- //[[:Florido Paganelli]] 2014/04/28 19:33//
 +  * updated arc interface grid certificate
 +  * joined NorduGrid Sweden indexes
 +  * configured syslog installation. Needs tweaking of hostname on nodes with multiple interfaces. These nodes do not have special configuration in salt yet.
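 + A minimal sketch of the client-side forwarding this could look like, assuming rsyslog; the collector host, port, and node name below are assumptions, not taken from the log. Pinning ''$LocalHostName'' is one way to fix the hostname on nodes with multiple interfaces:

```
# /etc/rsyslog.conf fragment on a worker node (collector host/port hypothetical)
$LocalHostName n1-iridium          # pin the hostname on multi-interface nodes
*.* @service-iridium:514           # forward all logs over UDP to the collector
```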
 +
 + --- //[[:Florido Paganelli]] 2014/03/21 21:00//
 +  * made so many changes that I couldn't keep track of them. Now the testing nodes are fully accessible from the internet.
 +  * rewrote documentation for the cluster.
 +
 + --- //[[:Florido Paganelli]] 2014/03/17 20:39//
 +  * added network configuration for n1 (nptest-iridium) and n2 (pptest-iridium)
 +  * rebooted nodes, now accessible from the internet. eth0 shut down, awaiting iptables config
 +  * todo: block access to other nodes, only allow course users, disable slurm, configure iptables
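 + A sketch of what the planned iptables policy for n1/n2 might look like, in iptables-restore format. The cluster-internal subnet and the exact policy are assumptions for illustration, not the configuration actually deployed:

```
# /etc/sysconfig/iptables sketch for n1/n2 (subnet 10.0.0.0/24 is hypothetical)
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# block traffic coming from the other cluster nodes
-A INPUT -s 10.0.0.0/24 -j DROP
# ssh from the internet for the course users
-A INPUT -p tcp --dport 22 -j ACCEPT
COMMIT
```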
 +
 + --- //[[:Florido Paganelli]] 2014/03/14 18:00//
 +  * added sshd configuration for n1 and n2; only my account and the maintenance user can access
 +  * added envvars for umask
 +  * changed motd
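 + The sshd restriction and the umask envvars could be sketched as below; the user names and the chosen umask value are hypothetical, not taken from the log:

```
# /etc/ssh/sshd_config fragment on n1/n2 (user names are hypothetical)
AllowUsers florido maintenance
PermitRootLogin no

# /etc/profile.d/umask.sh -- shell-wide umask for all login sessions
umask 027
```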
 +
 + --- //[[:Florido Paganelli]] 2014/03/10 17:19//
 +  * updated salt-master on service-iridium due to incompatibilities with newer minions
 +  * created two simple queues in slurm
 +  * reconfigured all slurm nodes
 +  * reconfigured arc frontend
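 + Two simple SLURM queues could be declared in ''slurm.conf'' roughly as follows; the partition names, node list, and time limits are assumptions for illustration:

```
# slurm.conf fragment: two simple queues (names and limits hypothetical)
PartitionName=short Nodes=n[1-7] MaxTime=01:00:00   Default=YES State=UP
PartitionName=long  Nodes=n[1-7] MaxTime=7-00:00:00 State=UP
```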
 +
 + --- //[[:Florido Paganelli]] 2014/03/07 17:23//
 +  * installed terena certificate on frontend.
 +  * configured nordugrid VOMS server
 +
 + --- //[[:Florido Paganelli]] 2014/02/21 10:26//
 +  * installed and configured arc. Job submission with EMI-ES is a failure, but maybe ARC 4.1 will solve it all.
 +  * configured arc with slurm and cache; however, it would be better to have two grid users (one for each division) and two queues.
 +
 + --- //[[:Florido Paganelli]] 2014/02/18 13:55//
 +  * installed salt on arc-iridium
 +  * installed other trustanchors on arc-iridium
 +  * configured firewall (still needs cleanup)
 +  * installed and configured munge, slurm, autofs, nfs
 +
 + --- //[[:Florido Paganelli]] 2014/02/17 19:07//
 +  * installed NG repos on arc-iridium and trustanchors
 +  * configured a-rex using instantca
 +  * initiated process of requesting host cert
 +  * asked Lunarc to open ports for ARC services
 +
 + --- //[[:Florido Paganelli]] 2014/02/11 18:29//
 +  * updated hep-srv :!: reboot didn't work, help request submitted to Lunarc.
 +  * installed epel on arc-iridium
 +  * installed ARC on arc-iridium
 +  * installed munge on arc-iridium
 +
 + --- //[[:Florido Paganelli]] 2014/02/07 18:31//
 +  * rebooted service-iridium
 +  * updated and rebooted ssh gateway
 +  * updated and configured arc-iridium
 +
 + --- //[[:Florido Paganelli]] 2014/02/04 18:31//
 +  * rebooted kvm-iridium
 +  * created arc-iridium machine
 +  * updated kvm-iridium to Centos6.5
 +  * cloning disk for arc-iridium
 +  * started service-iridium update
 +
 + --- //[[:Florido Paganelli]] 2014/01/31 16:02//
 +  * finalized slurm installation
 +  * fixed several iptables and munge issues
 +  * rebooted all nodes
 +
 + --- //[[:Florido Paganelli]] 2014/01/30 17:45//
 +
 +  * finished configuring SLURM, tests started
 +  * copied ssh keys
 +  * updated git
 +  * updated iptables configuration for all nodes
 +  * changed anaconda scripts (to be tested!) to include ssh key retrieval
 +
 + --- //[[:Florido Paganelli]] 2014/01/29 16:55//
 +  * installed 'SLURM' from the FGI repositories maintained by Tiggi. This slurm is built without MPI support and with mysql and lua support
 +  * updated roadmap
 +
 + --- //[[:Florido Paganelli]] 2014/01/28 17:28//
 +  * added 'MUNGE' to all nodes incl. 'service-iridium'. Not clear where the frontend should run, but I would say on the same machine that runs ARC. That means it has to share the secret key.
 +  * generated the secret key on 'service-iridium'.
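 + MUNGE only authenticates between hosts that hold an identical secret key, so the key generated on 'service-iridium' has to be copied byte-for-byte to every machine running munged. A sketch of the procedure, writing to /tmp so it runs unprivileged; real deployments use /etc/munge/munge.key, and the host list in the loop is an assumption:

```shell
# Generate a 1 KiB random MUNGE key (real path: /etc/munge/munge.key,
# owned by the munge user, mode 0400 -- munged refuses looser permissions).
dd if=/dev/urandom of=/tmp/munge.key bs=1 count=1024 2>/dev/null
chmod 400 /tmp/munge.key

# Distribute the identical key to every node running munged
# (host list is hypothetical; uncomment on the real cluster):
# for h in arc-iridium n1 n2 n3; do
#   scp -p /tmp/munge.key root@$h:/etc/munge/munge.key
# done
```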
  
 --- //[[:Florido Paganelli]] 2013/11/07 14:03//
Line 204: Line 288:
   * Mixed config for NFS3/NFS4. Would be better to use nfs4 and limit portmapper to NIS.
   * needs stronger authorization check on portmap/rpcbind services (i.e. hosts.allow on all machines) to be done in NIS SALT
-  * update page [[iridium_cluster:​wantedpackagesonnodes]] with current salt configuration. +  * <del>update page [[iridium_cluster:​wantedpackagesonnodes]] with current salt configuration.</​del>​
-  * change clusterip salt configuration with jinja is such a way it only applies to n1 and n2 +
-  * clusterip might need proper ssh keys distribution on n1 and n2+
   * generation and distribution of host keys on the nodes at deployment time is needed. Might be done as Luis did for SALT.
   * change default user groups in the cluster for all users and in documentation
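The hosts.allow restriction for portmap/rpcbind mentioned above could look roughly like this; the cluster subnet is hypothetical, and the matching catch-all would go in /etc/hosts.deny:

```
# /etc/hosts.allow sketch: allow portmap/rpcbind only from the cluster network
# (subnet is hypothetical); pair with "rpcbind: ALL" in /etc/hosts.deny
rpcbind: 10.0.0.0/255.255.255.0
portmap: 10.0.0.0/255.255.255.0
```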
Line 214: Line 296:
   * sort out why dhcp won't renew after trying for a long time.
   * add CA certificates to all machines. Probably a storage server to share certs would do the trick and need only one crl check in place.
 +  * define a proper time for node updates
 +  * disable cvmfs autoupdate
 +  * salt-minion does not restart after upgrade. Find a means to force it. The salt command might jam if done that way: <​code>​salt -v '​n5.iridium'​ cmd.run "​service salt-minion restart"</​code>​
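 + The jam happens because restarting the minion kills the very process that is executing the command, so the job never returns. A commonly used workaround (not verified on this cluster) is to detach the restart so the minion can report back before it dies:

```shell
salt -v 'n5.iridium' cmd.run 'nohup sh -c "sleep 5; service salt-minion restart" >/dev/null 2>&1 &'
```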
iridium_cluster/captainslog.txt · Last modified: 2014/06/12 17:19 by florido
