An email arrives that includes the following warning message: Warning health conditions currently exist. Correct these conditions before they affect array operation. Non-fatal RAIDset failure. While the RAID set is degraded, performance and availability might be decreased. There are 1 outstanding health conditions. Correct these conditions before they affect array operation.
It can't be a good sign, but at least it says the condition is not fatal, so we can go back to sleep with an easy mind.
In the morning we check the system logs and confirm that the error is a failed disk: the system tried to repair it and could not, so one of the spare disks came in to rebuild the array.
Severity Date Time Member Message
-------- -------- ----------- ------ -------
INFO 9/01/13 09:41:44 PM EQL3 Reconstruction of RAID LUN 0 completed in 3815 seconds.
WARNING 9/01/13 09:41:44 PM EQL3 Warning health conditions currently exist. Correct these conditions before they affect array operation. More spare drives are expected. There are 1 outstanding health conditions. Correct these conditions before they affect array operation.
INFO 9/01/13 09:41:44 PM EQL3 RAID set has recovered from a failure.
INFO 9/01/13 08:42:26 PM EQL3 Attempt to remove drive 12 from RAID set was not successful.
INFO 9/01/13 08:38:08 PM EQL3 Reconstruction of RAID LUN 0 initiated.
WARNING 9/01/13 08:38:08 PM EQL3 Warning health conditions currently exist. Correct these conditions before they affect array operation. Non-fatal RAIDset failure. While the RAID set is degraded, performance and availability might be decreased. More spare drives are expected. There are 2 outstanding health conditions. Correct these conditions before they affect array operation.
WARNING 9/01/13 08:38:08 PM EQL3 Failure: HDD Drive: 12, Model: XXXXXXXXXXX , Serial Number: XXXXXXXX
WARNING 9/01/13 08:37:46 PM EQL3 Preemptive removal of Enclosure/Drive 0/12 has now been approved; proceeding with removal.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660619 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660616 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660614 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660608 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660603 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660601 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660599 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660597 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660595 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660593 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660591 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:26 PM EQL3 Unable to repair bad disk sector 49658995 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:26 PM EQL3 Unable to repair bad disk sector 49658988 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:26 PM EQL3 Unable to repair bad disk sector 49658986 on disk drive 12 in RAID LUN 0.
ERROR 9/01/13 08:37:26 PM EQL3 Unable to repair bad disk sector 49658982 on disk drive 12 in RAID LUN 0.
INFO 9/01/13 08:37:26 PM EQL3 Attempt to remove drive 12 from RAID set was not successful.
WARNING 9/01/13 08:37:17 PM EQL3 Warning health conditions currently exist. Correct these conditions before they affect array operation. Non-fatal RAIDset failure. While the RAID set is degraded, performance and availability might be decreased. There are 1 outstanding health conditions. Correct these conditions before they affect array operation.
ERROR 9/01/13 08:37:17 PM EQL3 Disk drive 12 failed in RAID LUN 0.
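A long event log like this one is easier to review once it is tallied. The sketch below is only an illustration, not a Dell tool: it assumes each event starts with a severity word, a date, a 12-hour time and the member name, as in the excerpt above, and the names `EVENT`, `SECTOR` and `summarize` are made up for the example. It counts events per severity and collects the sectors that could not be repaired, per drive.

```python
import re
from collections import Counter

# One event = severity, date, 12-hour time, member name, message.
# The lookahead ends a message where the next event header begins, so the
# pattern copes with one event per line and with text pasted as a single line.
EVENT = re.compile(
    r"(INFO|WARNING|ERROR)\s+(\d{1,2}/\d{2}/\d{2})\s+"
    r"(\d{2}:\d{2}:\d{2}\s+[AP]M)\s+(\S+)\s+(.*?)"
    r"(?=\s+(?:INFO|WARNING|ERROR)\s+\d{1,2}/\d{2}/\d{2}\s|\s*\Z)",
    re.DOTALL,
)
SECTOR = re.compile(r"Unable to repair bad disk sector (\d+) on disk drive (\d+)")


def summarize(log_text):
    """Return (events per severity, unrepairable sectors per drive)."""
    severities = Counter()
    bad_sectors = {}
    for severity, _date, _time, _member, message in EVENT.findall(log_text):
        severities[severity] += 1
        match = SECTOR.search(message)
        if match:
            sector, drive = int(match.group(1)), int(match.group(2))
            bad_sectors.setdefault(drive, []).append(sector)
    return severities, bad_sectors


# Three events copied from the log above, pasted as one line.
sample = (
    "WARNING 9/01/13 08:37:46 PM EQL3 Preemptive removal of Enclosure/Drive 0/12 "
    "has now been approved; proceeding with removal. "
    "ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660619 "
    "on disk drive 12 in RAID LUN 0. "
    "ERROR 9/01/13 08:37:46 PM EQL3 Unable to repair bad disk sector 49660616 "
    "on disk drive 12 in RAID LUN 0."
)
severities, bad_sectors = summarize(sample)
print(dict(severities))  # {'WARNING': 1, 'ERROR': 2}
print(bad_sectors)       # {12: [49660619, 49660616]}
```

Run against the full excerpt, a summary like this makes it obvious that every unrepairable sector belongs to drive 12.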
If these units are under warranty, the easiest and most advisable course is to call DELL technical support and have them replace the failed disk. Running the diag command is a mandatory step: it sends the report to the support engineers so they can review the events that occurred.
diag
The diag command will gather configuration data from this array for support and troubleshooting purposes. No user information will be included in this data.
Results will be sent to "gabrielxx@xxxxxxxxxxxx.com.mx" through e-mail. If this is unsuccessful, other options for retrieving the results will be presented at the end of the procedure.
Finally, please remember to include your Dell Technical Support case or incident number in the subject line of any e-mail that you send to Dell Support. This will help ensure that the message is routed correctly.
Do you wish to proceed (y/n) [y]: y
Starting data collection on Thu Jan 10 14:59:06 CST 2013.
Section 1 of 15: . Finished in 0 seconds
Section 2 of 15: .........0.........0....... Finished in 12 seconds
Section 3 of 15: .... Finished in 28 seconds
Section 4 of 15: .........0...... Finished in 12 seconds
Section 5 of 15: .........0....... Finished in 5 seconds
Section 6 of 15: .........0.........0. Finished in 7 seconds
Section 7 of 15: . Finished in 2 seconds
Section 8 of 15: .... Finished in 1 seconds
Section 9 of 15: .........0.........0.........0.........0.........0.........0 Finished in 4 seconds
Section 10 of 15: ... Finished in 1 seconds
Section 11 of 15: .........0.........0.........0.. Finished in 34 seconds
Section 12 of 15: .. Finished in 2 seconds
Section 13 of 15: . Finished in 4 seconds
Section 14 of 15: .........0.........0.........0.........0.........0.........0.........0.........0.........0.........0.........0.........0.........0.........0.........0.........0.........0.........0.........0.... Finished in 57 seconds
Section 15 of 15: . Finished in 1 seconds
Sending e-mail 1 of 6.
Sending e-mail 2 of 6.
Sending e-mail 3 of 6.
Sending e-mail 4 of 6.
Sending e-mail 5 of 6.
Sending e-mail 6 of 6.
You have the option of retrieving the diagnostic data using FTP or SCP.
To use FTP, use the 'mget' command to retrieve all files matching the specification "Seg_*.dgo". You must use the "grpadmin" account and password, and connect to one of the IP addresses from the list below.
To use SCP, enter the command: 'scp -r grpadmin@x.x.x.x:. destdir' where "x.x.x.x" is one of the IP addresses from the list below. Then, in the destination location, look for files with the name "Seg_*.dgo". You can delete any other files retrieved by scp.
Here are the IP addresses you can use to retrieve files from this member:
172.XXX.XXX.XXX
172.XXX.XXX.XXX
172.XXX.XXX.XXX
172.XXX.XXX.XXX
172.XXX.XXX.XXX
You also have the option to capture the output by using the "text capture" feature of your Telnet or terminal emulator program.
Do you wish to do this (y/n) [n]:
Grupo3> logout
Grupo3> Connection closed by foreign host.
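If the e-mails do not go through and the data is pulled with the scp command shown in the output, the destination directory ends up holding more than the diagnostic segments. As a rough sketch, the following separates the "Seg_*.dgo" files from everything else so the extras can be reviewed before anything is deleted. The function name `split_diag_files` and the demo file names are invented for the illustration; only the "Seg_*.dgo" pattern comes from the diag output.

```python
from fnmatch import fnmatchcase
from pathlib import Path
from tempfile import TemporaryDirectory

PATTERN = "Seg_*.dgo"  # file specification quoted in the diag output


def split_diag_files(destdir):
    """Return (segments, others): files matching PATTERN, and the rest."""
    files = sorted(p for p in Path(destdir).rglob("*") if p.is_file())
    segments = [p for p in files if fnmatchcase(p.name, PATTERN)]
    others = [p for p in files if not fnmatchcase(p.name, PATTERN)]
    return segments, others


# Demo on a throwaway directory; the file names below are made up.
with TemporaryDirectory() as destdir:
    for name in ("Seg_1.dgo", "Seg_2.dgo", "notes.txt"):
        Path(destdir, name).write_text("")
    segments, others = split_diag_files(destdir)
    print([p.name for p in segments])  # ['Seg_1.dgo', 'Seg_2.dgo']
    print([p.name for p in others])    # ['notes.txt']
```

The sketch only lists the files. Deleting the ones in `others` is left as a manual step on purpose.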
In less than 24 hours the technical support people are on site, ready to replace the failed disk. The system logs show them removing it and inserting a new one, which becomes a spare disk in place of the earlier spare that went into the array as an active member.
Severity Date Time Member Message
-------- -------- ----------- ------ -------
INFO 11/01/13 10:07:35 AM EQL3 Expected number of spare drives now present.
INFO 11/01/13 10:07:35 AM EQL3 Creating a RAID label for uninitialized drive 12.
INFO 11/01/13 10:07:35 AM EQL3 Disk 12 is online.
INFO 11/01/13 10:07:35 AM EQL3 Found and verified new drive: enclosure 0, disk 12, Model ST3600057SS , SN 6SL4NJ2J.
INFO 11/01/13 10:07:13 AM EQL3 Disk 12 has been inserted.
WARNING 11/01/13 10:03:33 AM EQL3 Disk 12 has been removed.
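The timestamps in the two log excerpts are enough to work out how long the rebuild took and how long the array ran short of spares. The sketch below assumes the dates are day/month/year, which matches the diag run dated Thu Jan 10 2013, and that the interpreter runs under an English or C locale so that `%p` understands AM/PM. The helper name `elapsed` is made up for the example.

```python
from datetime import datetime

# Assumption: dates are day/month/year, times use a 12-hour clock.
FMT = "%d/%m/%y %I:%M:%S %p"


def elapsed(start, end):
    """Time between two log timestamps as a timedelta."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


# "Reconstruction ... initiated" to "Reconstruction ... completed".
rebuild = elapsed("9/01/13 08:38:08 PM", "9/01/13 09:41:44 PM")
# First "More spare drives are expected" to "Expected number of spare drives now present".
short_of_spares = elapsed("9/01/13 08:38:08 PM", "11/01/13 10:07:35 AM")

print(int(rebuild.total_seconds()))  # 3816
print(short_of_spares)               # 1 day, 13:29:27
```

The 3816 seconds computed from the timestamps agree, within the one-second resolution of the log, with the 3815 seconds the array reports for the reconstruction.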
As long as the disks don't all agree among themselves to fail at once, the storage system will be fine.
It is very hard to repair a disk RAID, especially when the failure involves physical damage to one or more disks…
In that case all you can do is try to recover the data through a company that specializes in that work. Onretrieval is a lab that specializes in failed RAIDs.
Regards.