This shows you the differences between two versions of the page.
iridium_cluster:captainslog [2013/11/11 10:35] florido [Fixmes] |
iridium_cluster:captainslog [2014/06/12 17:19] florido |
||
---|---|---|---|
Line 4: | Line 4: | ||
The logs can be read top to bottom from the most recent change to the oldest. | The logs can be read top to bottom from the most recent change to the oldest. | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/06/02 17:40// | ||
+ | * shared each node's tmp folder in '/nodestmp/<nodename>' :!: documentation about it needs to be written :!: | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/06/02 17:40// | ||
+ | * updated and rebooted nptest and pptest | ||
+ | * installed quantum espresso and RTE | ||
+ | * updated salt configuration: | ||
+ | * software needed by Nuclear Physics | ||
+ | * module configuration for custom modulefiles | ||
+ | * motd and banner | ||
+ | * gitted and backed up salt config | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/04/28 19:33// | ||
+ | * updated arc interface grid certificate | ||
+ | * joined NorduGrid Sweden indexes | ||
+ | * configured syslog installation. Needs tweaking of hostname on nodes with multiple interfaces. These nodes do not have special configuration in salt yet. | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/03/21 21:00// | ||
+ | * made so many changes that I couldn't keep track of them. The testing nodes are now fully accessible from the internet. | ||
+ | * rewrote documentation for the cluster. | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/03/17 20:39// | ||
+ | * added network configuration for n1 (nptest-iridium) and n2 (pptest-iridium) | ||
+ | * rebooted nodes, now accessible from the internet. eth0 shut down awaiting iptables config | ||
+ | * todo: block access to other nodes, only allow course users, disable slurm, configure iptables | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/03/14 18:00// | ||
+ | * added sshd configuration for n1 and n2; only the maintenance user and I can access them | ||
+ | * added envvars for umask | ||
+ | * changed motd | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/03/10 17:19// | ||
+ | * updated salt-master on service-iridium due to incompatibilities with newer minions | ||
+ | * created two simple queues in slurm | ||
+ | * reconfigured all slurm nodes | ||
+ | * reconfigured arc frontend | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/03/07 17:23// | ||
+ | * installed terena certificate on frontend. | ||
+ | * configured nordugrid VOMS server | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/02/21 10:26// | ||
+ | * installed and configured arc. Job submission with emi-es fails, but maybe ARC 4.1 will fix this. | ||
+ | * configured arc with slurm and cache; however, it would be better to have two grid users (one for each division) and two queues. | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/02/18 13:55// | ||
+ | * installed salt on arc-iridium | ||
+ | * installed other trustanchors on arc-iridium | ||
+ | * configured firewall (still needs cleanup) | ||
+ | * installed and configured munge, slurm, autofs, nfs | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/02/17 19:07// | ||
+ | * installed NG repos on arc-iridium and trustanchors | ||
+ | * configured a-rex using instantca | ||
+ | * initiated process of requesting host cert | ||
+ | * asked Lunarc to open ports for ARC services | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/02/11 18:29// | ||
+ | * updated hep-srv :!: reboot didn't work, help request submitted to Lunarc. | ||
+ | * installed epel on arc-iridium | ||
+ | * installed ARC on arc-iridium | ||
+ | * installed munge on arc-iridium | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/02/07 18:31// | ||
+ | * rebooted service-iridium | ||
+ | * updated and rebooted ssh gateway | ||
+ | * updated and configured arc-iridium | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/02/04 18:31// | ||
+ | * rebooted kvm-iridium | ||
+ | * created arc-iridium machine | ||
+ | * updated kvm-iridium to CentOS 6.5 | ||
+ | * cloning disk for arc-iridium | ||
+ | * started service-iridium update | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/01/31 16:02// | ||
+ | * finalized slurm installation | ||
+ | * fixed several iptables and munge issues | ||
+ | * rebooted all nodes | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/01/30 17:45// | ||
+ | |||
+ | * finished configuring SLURM, tests started | ||
+ | * copied ssh keys | ||
+ | * updated git | ||
+ | * updated iptables configuration for all nodes | ||
+ | * changed anaconda scripts (to be tested!) to include ssh key retrieval | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/01/29 16:55// | ||
+ | * installed 'SLURM' from FGI repositories maintained by Tiggi. This slurm is built without MPI support and with mysql and lua support | ||
+ | * updated roadmap | ||
+ | |||
+ | --- //[[:Florido Paganelli]] 2014/01/28 17:28// | ||
+ | * added 'MUNGE' to all nodes incl. 'service-iridium'. Not clear where the frontend should run, but I would say on the same machine that runs ARC. That means it has to share the secret key. | ||
+ | * generated the secret key on 'service-iridium'. | ||
--- //[[:Florido Paganelli]] 2013/11/07 14:03// | --- //[[:Florido Paganelli]] 2013/11/07 14:03// | ||
Line 205: | Line 301: | ||
* needs stronger authorization check on portmap/rpcbind services (i.e. hosts.allow on all machines) to be done in NIS SALT | * needs stronger authorization check on portmap/rpcbind services (i.e. hosts.allow on all machines) to be done in NIS SALT | ||
* <del>update page [[iridium_cluster:wantedpackagesonnodes]] with current salt configuration.</del> | * <del>update page [[iridium_cluster:wantedpackagesonnodes]] with current salt configuration.</del> | ||
- | * clusterip might need proper ssh keys distribution on n1 and n2 | ||
* generation and distribution of host keys on the nodes at deployment time is needed. Might be done as Luis did for SALT. | * generation and distribution of host keys on the nodes at deployment time is needed. Might be done as Luis did for SALT. | ||
* change default user groups in the cluster for all users and in documentation | * change default user groups in the cluster for all users and in documentation |