[NCLUG] bootable software raid 1?

Michael Milligan milli at acmeps.com
Sat Jun 14 22:16:24 MDT 2003


Erich wrote:
> I successfully installed KRUD8.0 on twin 40GB IDE drives by
> partitioning one and then changing all partitions to type
> raid, and then cloning them onto the second drive.  I am
> attempting to simulate a drive failure by removing one
> drive, and replacing with a blank of the same capacity. 

You can't just throw in a "blank"; it has to have a proper partition 
table set up, one that matches the failed disk.  For RAID 1, in effect, 
you just copy the partition table of the good disk.

Perhaps that's what you meant.
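
One way to match the partition table, as a rough sketch: sfdisk can dump 
the table of the good disk and read it back onto the replacement.  The 
device names below (/dev/hda as the survivor, /dev/hdb as the new disk) 
and the helper name are assumptions for illustration.  The script only 
prints the command so you can check the devices before running it.

```shell
#!/bin/sh
# Sketch: copy the partition table from the surviving disk to the replacement.
# GOOD and NEW are assumed device names; verify them before running anything.
GOOD=${GOOD:-/dev/hda}
NEW=${NEW:-/dev/hdb}

# Hypothetical helper: build the command line instead of executing it.
clone_ptable_cmd() {
    # "sfdisk -d" dumps the table in a format sfdisk can read back on stdin.
    echo "sfdisk -d $GOOD | sfdisk $NEW"
}

# Print the command for review; run it as root once the devices look right.
clone_ptable_cmd
```

Getting GOOD and NEW backwards would overwrite the good disk's table, 
which is why the sketch prints rather than executes.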

> I have obviously blundered.  I want to have a mirrored set
> up that will allow for a complete disk failure and still
> allow boot/use of the machine.
> 
> Is this possible with software raid-1, and if so, what am I
> doing wrong?

Even though software RAID 1 basically only puts a RAID superblock at the 
end of each member partition, I've never counted on this to always be 
the case.  I've always created a small (20gig-ish) partition that gets 
mounted as /boot, so lilo/grub will always see a normal ext2/3 partition 
to boot from, e.g., /dev/hda1.  I then mirror this setup on the second 
disk, i.e., /dev/hdb1.  Once you have the system all squared away and 
booting up fine with RAID autodetect partition types, you just:

# dd if=/dev/hda1 of=/dev/hdb1 bs=4k
# fsck /dev/hdb1

The fsck is needed because you probably just cloned a mounted partition 
(/boot), so it's not marked as clean.  You want it to be clean.  ;-)

For LILO then, you need to change the "boot=" line to point at /dev/hdb, 
then run lilo so it installs a boot block on the second drive.  That's 
it for lilo.  If you use grub, install it on the second drive the same 
way.
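
A rough sketch of that step follows.  Instead of editing lilo.conf, it 
uses lilo's -b option, which overrides "boot=" for a single run.  The 
GRUB lines are an assumption on my part: they are GRUB legacy shell 
commands, and mapping the second disk to (hd0) with /boot on its first 
partition is one common choice, not the only one.  The helper names are 
hypothetical, and nothing here touches a disk; it only prints.

```shell
#!/bin/sh
# Sketch: commands for putting a boot block on the second drive.
# SECOND is an assumed device name taken from the example above.
SECOND=${SECOND:-/dev/hdb}

# LILO: -b overrides the boot= line in /etc/lilo.conf for this one run.
lilo_cmd() {
    echo "lilo -b $SECOND"
}

# GRUB legacy: treat the second disk as (hd0), as it would appear to the
# BIOS once the first disk is gone, then install to its MBR.
# Assumes /boot is the first partition, i.e. (hd0,0).
grub_cmds() {
    printf '%s\n' "device (hd0) $SECOND" "root (hd0,0)" "setup (hd0)" "quit"
}

lilo_cmd
grub_cmds
```

To apply the GRUB part, the printed lines could be fed to the grub shell 
as root, e.g. "grub_cmds | grub --batch".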

All set!  You can now boot off of either drive, i.e., no matter which 
one fails.  The RAID stuff is smart enough to know that drive orders 
have switched too.

The only caveat is ongoing maintenance.  I have caught myself installing 
new kernels and forgetting to do the dd and fsck onto the other disk. 
That can be surprising if you have a failure (the machine boots up with 
an old kernel), but it's not usually fatal.
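
To make the re-clone harder to forget, the dd and fsck could be wrapped 
in a small function and run after every kernel install.  This is only a 
sketch: sync_boot is a hypothetical name, and the device names in the 
usage comment are the ones from the example above.

```shell
#!/bin/sh
# Sketch: re-clone /boot onto the second disk after a kernel install.

# Hypothetical helper: sync_boot SOURCE TARGET
sync_boot() {
    src=$1
    dst=$2
    # Same copy as the dd above; dd's transfer statistics are discarded.
    dd if="$src" of="$dst" bs=4k 2>/dev/null || return 1
    # Only check a real block device; a plain file has no filesystem to fix.
    if [ -b "$dst" ]; then
        rc=0
        fsck "$dst" || rc=$?
        # fsck exits 0 when clean and 1 when it corrected errors; both are fine.
        [ "$rc" -le 1 ] || return 1
    fi
    echo "updated: $dst"
}

# Usage (as root, after installing a new kernel):
#   sync_boot /dev/hda1 /dev/hdb1
```

Exit status 1 from fsck is accepted because a clone of a mounted /boot 
is expected to need its "not clean" state fixed.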

Regards,
Mike

-- 
Michael Milligan  --  Free Agent  --  milli at acmeps.com



