alicebeast

Differences

This shows you the differences between two versions of the page.

alicebeast [2021/01/19 17:15]
florido [RAID disk failing]
alicebeast [2021/01/19 17:23]
florido [Emergency procedures]
Line 110: Line 110:
   * **swRAID**: a device managed by software RAID, which is a feature of the kernel to create RAID disks. Such feature is less reliable than an actual physical raid but it should give enough time to take action when something breaks. It includes you being able to finish a running job before shutting down the machine.
  
-==== RAID disk failing ====
+All commands must be run while being root (''sudo -s'')
+==== HW RAID disk failing ====
  
-This can happen if one or more of the RAID disks are broken or starting to fail. You are supposed to take action asap. The HW RAID is the big storage with lots of data. It should resist two broken disks before total failure (loss of data, unreadable data).
+This can happen if one or more of the RAID disks are broken or starting to fail. You are supposed to take action asap.
+
+The HW RAID is the big storage with lots of data. It should resist two broken disks before total failure (loss of data, unreadable data).
+
+A list of commands to interact with the RAID is here: [[it_tips:storcliqr]]
  
 > How to detect?
Line 120: Line 125:
  
 > What to do
+
   * contact me
   * Identify the broken disk
   * Substitute the broken disk with a new one
-  * Follow instructions about how to rebuild the array. The system does it automatically. One can check the status of the rebuild with <code bash>#show background activity
+  * The system should automatically rebuild the array. One can check the status of the rebuild with <code bash>#show background activity
 storcli /c0/v0 show bgi</code><code bash># show array initialization or rebuild
 storcli /c0/v0 show init</code>
Line 134: Line 140:
  
 > How to detect?
+
+  * I have a script that informs me of the status of the RAID. I will send the output to you as well after some testing
+  * A command that can be used to check the arrays is as follows: <code:shell>for i in 5 6 7; do mdadm --detail /dev/md12$i; done</code>
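
The ''mdadm --detail'' loop above prints a lot of text per array. A minimal sketch of how its output could be condensed to one status line per array follows. The function names ''md_state'', ''md_is_healthy'' and ''check_md_arrays'' are hypothetical and not part of the page or of mdadm, and the rule that only an exact ''clean'' or ''active'' state counts as healthy is an assumption made for illustration.

```shell
#!/bin/sh
# Sketch only: md_state, md_is_healthy and check_md_arrays are hypothetical
# helper names, not part of mdadm or of the original page.

# Read `mdadm --detail` output on stdin and print the value of the first
# "State :" line, with trailing blanks removed.
md_state() {
    awk -F' : ' '/^[ \t]*State :/ { sub(/[ \t]+$/, "", $2); print $2; exit }'
}

# Assumption: only an exact "clean" or "active" state counts as healthy;
# anything else (degraded, recovering, empty, ...) is flagged.
md_is_healthy() {
    case "$1" in
        clean|active) return 0 ;;
        *) return 1 ;;
    esac
}

# Print one line per array and return non-zero if any array needs attention.
# Needs root and mdadm, like the loop it is based on.
check_md_arrays() {
    rc=0
    for i in 5 6 7; do
        dev="/dev/md12$i"
        state=$(mdadm --detail "$dev" 2>/dev/null | md_state)
        if md_is_healthy "$state"; then
            echo "$dev: OK ($state)"
        else
            echo "$dev: NEEDS ATTENTION (${state:-no state reported})"
            rc=1
        fi
    done
    return "$rc"
}
```

Running ''check_md_arrays'' as root would then print one line per device. Treating every state other than ''clean'' or ''active'' as a problem is deliberately strict, so a rebuild in progress is also reported.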
  
 > What to do
alicebeast.txt · Last modified: 2021/01/19 17:23 by florido
