[NCLUG] Need help with either md raid or possibly ext4 corrupted group descriptors

Michael Milligan milli at acmeps.com
Mon Feb 2 11:50:02 MST 2009


Mike Jensen wrote:

Got a backup?  ;-)

....

> Before that it does say that /dev/md0 is started with all three drives, so
> I just figured all I needed to do was fsck the array.  But when I boot to
> the fedora rescue CD it is not able to build the array automatically.  And
> I am not sure if I am assembling the array correctly.  I used the command:
> mdadm --build /dev/md0 --level=0 --raid-devices=3 /dev/sda2 /dev/sdb1
> /dev/sdc1

Oh dear, you needed to use the --assemble switch instead of --build.
mdadm is smart enough to scan all the partitions, look for superblocks,
and find all the parts of /dev/md0 on its own, provided all is clean
and proper.  If something is amiss, there are a number of ways to
recover, depending on the error.

Using --build will _ignore_ the superblocks and will likely put
things together incorrectly, as that mode assumes a legacy (pre-0.90)
array with no superblocks.  The layout is different.  Unless, of course,
you really do have an old (pre-0.90) RAID implementation, which is
extremely unlikely unless you built the original array more than 10
years ago.

You should "mdadm --stop /dev/md0", then do an "mdadm --examine --scan"
to see if the output looks good (it prints ARRAY lines in the same
format used in /etc/mdadm.conf), and then perhaps
"mdadm --examine /dev/sda2" and so on for each member, to make sure all
the partitions show "active sync".  If that's the case, then you should
be able to:

# mdadm --assemble /dev/md0

and the array should be assembled properly and automatically.
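
To keep the order straight, here is a minimal dry-run sketch of the
sequence above.  It only prints the commands, it does not run them.
The helper name recovery_plan is made up for illustration, and the
device names are the ones from the quoted message, so adjust to taste.

```shell
#!/bin/sh
# Dry-run sketch: print the recovery commands in order instead of running them.
# recovery_plan is a hypothetical helper name, not part of mdadm.
# Usage: recovery_plan <md device> <member partition>...
recovery_plan() {
    md="$1"; shift
    # Stop the array that was wrongly put together with --build.
    echo "mdadm --stop $md"
    # Show what arrays the superblocks describe.
    echo "mdadm --examine --scan"
    # Inspect each member's superblock; look for "active sync".
    for part in "$@"; do
        echo "mdadm --examine $part"
    done
    # Let mdadm assemble the array from the superblocks.
    echo "mdadm --assemble $md"
}

recovery_plan /dev/md0 /dev/sda2 /dev/sdb1 /dev/sdc1
```

Read the --examine output yourself before running the last command.
If the rescue environment has no mdadm.conf, listing the members
explicitly on the --assemble line is another valid form.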

If that doesn't work, there are some other, somewhat dangerous tricks
for getting it back.  I've used them in the past to recover from
apparent double-drive failures on RAID5 and the like.  They involve
"mdadm --create .... --assume-clean ...", which writes new superblocks
over the old ones, and that is why they're dangerous.

The last thing to check is that the partition type on all of the RAID
parts is "fd", so the RAID boot code knows to automatically build your
arrays at boot time.  I've never seen that change on its own, but it's
always good to double-check.
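
One way to double-check the type is to look at an "sfdisk -d" dump of
each disk.  The sketch below is only a rough helper under some
assumptions: check_raid_type is a made-up name, it handles MBR
partition tables only, and it expects the dump to mark the type as
"Id=fd" (older sfdisk) or "type=fd" (newer util-linux).  The sample
dump lines in the demo are invented for illustration, not real output.

```shell
#!/bin/sh
# Sketch: report whether the given partitions have MBR type "fd".
# check_raid_type is a hypothetical helper.  It reads `sfdisk -d` style
# output on stdin and takes the partitions to check as arguments.
# Returns non-zero if any partition is not type fd (or is not listed).
check_raid_type() {
    dump="$(cat)"
    rc=0
    for part in "$@"; do
        # Dump lines look like "/dev/sda2 : start=..., size=..., Id=fd".
        line="$(printf '%s\n' "$dump" | grep "^$part ")"
        if printf '%s\n' "$line" | grep -Eq '(Id|type)= *fd *(,|$)'; then
            echo "$part: ok (fd)"
        else
            echo "$part: NOT fd or not listed"
            rc=1
        fi
    done
    return $rc
}

# Real use would be something like:
#   sfdisk -d /dev/sda | check_raid_type /dev/sda2
# Demo with made-up dump lines:
check_raid_type /dev/sda1 /dev/sda2 <<'EOF' || echo "at least one partition needs attention"
/dev/sda1 : start=       63, size=   208782, Id=83, bootable
/dev/sda2 : start=   208845, size= 976559220, Id=fd
EOF
```

If a member turns out not to be "fd", that would only explain the array
not being built at boot.  It says nothing about the state of the data.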

Regards,
Mike

PS: The double-drive failures were controller errors... didn't really
lose the drives.  But that prompted replacing the failing controller.

-- 
Michael Milligan                                   -> milli at acmeps.com


