
XenServer on RAID 5 and failed disk(s)


Siliconworld
25/08/2013, 20:09
Out of curiosity, what year/range is your KS with four 1.5 TB hard disks?

Thanks, regards.

apocalipsis
25/08/2013, 19:42
I think you have lost the RAID... if I am reading this correctly, 2 of your 4 disks have failed:


Aug 22 21:48:04 ks308565 kernel: raid5: not enough operational devices for md2 (2/4 failed)
Aug 22 21:48:04 ks308565 kernel: RAID5 conf printout:
Aug 22 21:48:04 ks308565 kernel: --- rd:4 wd:2
Aug 22 21:48:04 ks308565 kernel: disk 1, o:1, dev:sdb2
Aug 22 21:48:04 ks308565 kernel: disk 2, o:1, dev:sdc2
Aug 22 21:48:04 ks308565 kernel: raid5: failed to run raid set md2
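
To make that reading of the log explicit, here is a minimal Python sketch that groups the md2 member partitions by what the kernel messages say about them. The `LOG` excerpt is copied from the log in the original post (timestamps and host prefix dropped); the function names `summarize` and `raid5_can_start` are hypothetical helpers, not part of any tool. It only restates what the log already reports and does not diagnose why sda2 or sdd2 were rejected.

```python
import re

# Excerpt of the kernel log from the original post (prefixes dropped).
LOG = """\
md: invalid superblock checksum on sda2
md: sda2 does not have a valid v0.90 superblock, not importing!
md: kicking non-fresh sdd2 from array!
raid5: device sdc2 operational as raid disk 2
raid5: device sdb2 operational as raid disk 1
raid5: not enough operational devices for md2 (2/4 failed)
"""


def summarize(log: str) -> dict:
    """Group member partitions by the md/raid5 message that mentions them."""
    return {
        "operational": sorted(re.findall(r"raid5: device (\w+) operational", log)),
        "bad_superblock": sorted(re.findall(r"md: invalid superblock checksum on (\w+)", log)),
        "non_fresh": sorted(re.findall(r"md: kicking non-fresh (\w+) from array", log)),
    }


def raid5_can_start(total: int, operational: int) -> bool:
    """RAID 5 tolerates the loss of at most one member."""
    return operational >= total - 1


if __name__ == "__main__":
    status = summarize(LOG)
    print(status)
    print("md2 can start:", raid5_can_start(4, len(status["operational"])))
```

Read this way, sdb2 and sdc2 are the two working members, sda2 was rejected for a bad superblock checksum, and sdd2 was kicked as non-fresh. With only 2 of 4 members, a RAID 5 array cannot start.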

waitatv
23/08/2013, 22:17
Hi!

We have a problem and don't know exactly how to proceed...

The server is a Kemsirve with four 1.5 TB disks in RAID 5, and a couple of days ago the VMs crashed with a Kernel Panic. When we tried to reboot the host, it could no longer connect to the "Local storage".

As far as I can tell, the problem is that it cannot bring up md2, so XenServer cannot find its Local storage.

This is what I see in the log:

Code:
Aug 22 21:48:04 ks308565 kernel: md: Autodetecting RAID arrays.
Aug 22 21:48:04 ks308565 kernel: md: invalid superblock checksum on sda2
Aug 22 21:48:04 ks308565 kernel: md: sda2 does not have a valid v0.90 superblock, not importing!
Aug 22 21:48:04 ks308565 kernel: md: invalid raid superblock magic on sda3
Aug 22 21:48:04 ks308565 kernel: md: sda3 does not have a valid v0.90 superblock, not importing!
Aug 22 21:48:04 ks308565 kernel: md: invalid raid superblock magic on sdb3
Aug 22 21:48:04 ks308565 kernel: md: sdb3 does not have a valid v0.90 superblock, not importing!
Aug 22 21:48:04 ks308565 kernel: md: invalid raid superblock magic on sdc3
Aug 22 21:48:04 ks308565 kernel: md: sdc3 does not have a valid v0.90 superblock, not importing!
Aug 22 21:48:04 ks308565 kernel: md: invalid raid superblock magic on sdd3
Aug 22 21:48:04 ks308565 kernel: md: sdd3 does not have a valid v0.90 superblock, not importing!
Aug 22 21:48:04 ks308565 kernel: md: Scanned 12 and added 7 devices.
Aug 22 21:48:04 ks308565 kernel: md: autorun ...
Aug 22 21:48:04 ks308565 kernel: md: considering sdd2 ...
Aug 22 21:48:04 ks308565 kernel: md:  adding sdd2 ...
Aug 22 21:48:04 ks308565 kernel: md: sdd1 has different UUID to sdd2
Aug 22 21:48:04 ks308565 kernel: md:  adding sdc2 ...
Aug 22 21:48:04 ks308565 kernel: md: sdc1 has different UUID to sdd2
Aug 22 21:48:04 ks308565 kernel: md:  adding sdb2 ...
Aug 22 21:48:04 ks308565 kernel: md: sdb1 has different UUID to sdd2
Aug 22 21:48:04 ks308565 kernel: md: sda1 has different UUID to sdd2
Aug 22 21:48:04 ks308565 kernel: md: created md2
Aug 22 21:48:04 ks308565 kernel: md: bind
Aug 22 21:48:04 ks308565 kernel: md: bind
Aug 22 21:48:04 ks308565 kernel: md: bind
Aug 22 21:48:04 ks308565 kernel: md: running: 
Aug 22 21:48:04 ks308565 kernel: md: kicking non-fresh sdd2 from array!
Aug 22 21:48:04 ks308565 kernel: md: unbind
Aug 22 21:48:04 ks308565 kernel: md: export_rdev(sdd2)
Aug 22 21:48:04 ks308565 kernel: raid5: device sdc2 operational as raid disk 2
Aug 22 21:48:04 ks308565 kernel: raid5: device sdb2 operational as raid disk 1
Aug 22 21:48:04 ks308565 kernel: raid5: not enough operational devices for md2 (2/4 failed)
Aug 22 21:48:04 ks308565 kernel: RAID5 conf printout:
Aug 22 21:48:04 ks308565 kernel:  --- rd:4 wd:2
Aug 22 21:48:04 ks308565 kernel:  disk 1, o:1, dev:sdb2
Aug 22 21:48:04 ks308565 kernel:  disk 2, o:1, dev:sdc2
Aug 22 21:48:04 ks308565 kernel: raid5: failed to run raid set md2
After a lot of research, I still have some questions:

Which hard disk has failed, exactly?
If I request a disk replacement, will I be able to activate md2 again and recover the data from the remaining disks?
This may be a silly question, but... can I disable the failed disk and boot everything anyway?

Here is the output of some commands:

Code:
[root@ksxxxxxx ~]# ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 root root  9 Aug 22 21:47 edd-int13_dev80 -> ../../sda
lrwxrwxrwx 1 root root 10 Aug 22 21:47 edd-int13_dev80-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Aug 22 21:47 edd-int13_dev80-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Aug 22 21:48 edd-int13_dev80-part3 -> ../../sda3
lrwxrwxrwx 1 root root  9 Aug 22 21:47 edd-int13_dev81 -> ../../sdb
lrwxrwxrwx 1 root root 10 Aug 22 21:47 edd-int13_dev81-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Aug 22 21:48 edd-int13_dev81-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Aug 22 21:48 edd-int13_dev81-part3 -> ../../sdb3
lrwxrwxrwx 1 root root  9 Aug 22 21:47 edd-int13_dev82 -> ../../sdc
lrwxrwxrwx 1 root root 10 Aug 22 21:47 edd-int13_dev82-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Aug 22 21:47 edd-int13_dev82-part2 -> ../../sdc2
lrwxrwxrwx 1 root root 10 Aug 22 21:47 edd-int13_dev82-part3 -> ../../sdc3
lrwxrwxrwx 1 root root  9 Aug 22 21:47 edd-int13_dev83 -> ../../sdd
lrwxrwxrwx 1 root root 10 Aug 22 21:47 edd-int13_dev83-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Aug 22 21:47 edd-int13_dev83-part2 -> ../../sdd2
lrwxrwxrwx 1 root root 10 Aug 22 21:47 edd-int13_dev83-part3 -> ../../sdd3
lrwxrwxrwx 1 root root  9 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2NTT0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2NTT0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Aug 22 21:48 scsi-SATA_ST31500341AS_9VS2NTT0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Aug 22 21:48 scsi-SATA_ST31500341AS_9VS2NTT0-part3 -> ../../sdb3
lrwxrwxrwx 1 root root  9 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2NVAR -> ../../sdc
lrwxrwxrwx 1 root root 10 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2NVAR-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2NVAR-part2 -> ../../sdc2
lrwxrwxrwx 1 root root 10 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2NVAR-part3 -> ../../sdc3
lrwxrwxrwx 1 root root  9 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2PFYC -> ../../sdd
lrwxrwxrwx 1 root root 10 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2PFYC-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2PFYC-part2 -> ../../sdd2
lrwxrwxrwx 1 root root 10 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2PFYC-part3 -> ../../sdd3
lrwxrwxrwx 1 root root  9 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2PGFF -> ../../sda
lrwxrwxrwx 1 root root 10 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2PGFF-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2PGFF-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Aug 22 21:48 scsi-SATA_ST31500341AS_9VS2PGFF-part3 -> ../../sda3
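
If a disk replacement is requested, the serial number is what ties a device name such as sda to a physical drive. As a rough sketch, the whole-disk `scsi-SATA_` entries of the listing above can be turned into a device-to-serial map. The regular expression assumes the `scsi-SATA_<model>_<serial>` naming seen here and a model name without underscores; `serial_by_device` is a hypothetical helper name.

```python
import re

# A few lines in the format of the `ls -l /dev/disk/by-id` listing above.
LISTING = """\
lrwxrwxrwx 1 root root  9 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2NTT0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2NTT0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root  9 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2NVAR -> ../../sdc
lrwxrwxrwx 1 root root  9 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2PFYC -> ../../sdd
lrwxrwxrwx 1 root root  9 Aug 22 21:47 scsi-SATA_ST31500341AS_9VS2PGFF -> ../../sda
"""

# Whole-disk entries only: partition links (-partN, sdX1) do not match.
# Assumption: the model name itself contains no underscore.
PATTERN = re.compile(
    r"scsi-SATA_(\w+?)_(\w+) -> \.\./\.\./(sd[a-z])$", re.MULTILINE
)


def serial_by_device(listing: str) -> dict:
    """Map each whole-disk device name (sda, sdb, ...) to its serial number."""
    return {dev: serial for _model, serial, dev in PATTERN.findall(listing)}


if __name__ == "__main__":
    for dev, serial in sorted(serial_by_device(LISTING).items()):
        print(dev, serial)
```
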
And these are the details of the Local storage that I cannot access:

Code:
uuid ( RO)                    : c29c02ee-c8cc-7dea-a92f-be8a94be44d7
              name-label ( RW): Local storage
        name-description ( RW): 
                    host ( RO): xen
      allowed-operations (SRO): forget; VDI.create; VDI.snapshot; plug; update; destroy; VDI.destroy; scan; VDI.clone; VDI.resize; unplug
      current-operations (SRO): 
                    VDIs (SRO): da7439ce-8039-4e95-90a2-730026939bdf; 8120d561-44a1-4995-a40c-2d8e4073572c; c00c50cf-0e03-4497-b61f-973f507aeeaf; af8813ef-58bf-49c1-b111-0dfefb67536a; 76f1ae67-fba1-4c01-a9ef-37095c302b28; 65722f81-1ae3-4532-b2ef-ecaad8dc9b1c; 39c87ea4-7333-4230-8347-095a521a6692; 269af34f-d081-48b8-9780-3285d467ba64; 6f07b50e-9441-4686-8899-adb8371296c2; dbf3a00f-d89b-4dab-8103-37dbde606abb; 8e6f4303-9693-4540-9138-a0f76c82b123; 1842cd34-20ea-4901-8422-db155707302f; f73aa258-2455-488b-8d01-e331d9a0a75a; 0872c706-bb4b-42f0-8efd-3ba134318dbf; 5ac16b88-f831-462f-b192-9f817b0d12e9
                    PBDs (SRO): 8c00324e-fc2b-6ace-d61e-c45a036ecc43
      virtual-allocation ( RO): 1677851623424
    physical-utilisation ( RO): 1853056090112
           physical-size ( RO): 4468670201856
                    type ( RO): lvm
            content-type ( RO): user
                  shared ( RW): false
            other-config (MRW): dirty: ; i18n-original-value-name_label: Local storage; i18n-key: local-storage
               sm-config (MRO): allocation: thick; use_vhd: true; devserial: 
                   blobs ( RO):
Thank you very much!
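
As a quick sanity check on the SR figures above, the reported physical-size can be compared with what RAID 5 over four 1.5 TB disks would nominally provide. This is only a sketch: it assumes "1.5 TB" means 1.5e12 bytes per disk, and `raid5_usable` is a hypothetical helper. A close match suggests, but does not prove, that the Local storage sits on md2.

```python
# Values copied from the SR parameters above, in bytes.
PHYSICAL_SIZE = 4468670201856
PHYSICAL_UTILISATION = 1853056090112

# Assumption: "1.5 TB" means 1.5e12 bytes per disk, as drive vendors count.
DISK_BYTES = 1.5e12
DISKS = 4


def raid5_usable(disks: int, disk_bytes: float) -> float:
    """RAID 5 spends one disk's worth on parity, leaving (n - 1) disks usable."""
    return (disks - 1) * disk_bytes


if __name__ == "__main__":
    nominal = raid5_usable(DISKS, DISK_BYTES)
    print(f"RAID 5 nominal capacity: {nominal / 1e12:.2f} TB")
    print(f"SR physical-size:        {PHYSICAL_SIZE / 1e12:.2f} TB")
    print(f"SR space in use:         {PHYSICAL_UTILISATION / PHYSICAL_SIZE:.0%}")
```

The small shortfall against the nominal figure would be consistent with the other partitions (sdX1, sdX3) taking some space on each disk, though the thread does not show the partition sizes.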