[NCLUG] sw raid, recovery after install

Stephen Warren swarren at wwwdotorg.org
Wed Jan 2 16:28:21 MST 2013


On 01/02/2013 04:01 PM, Bob Proulx wrote:
> Matt Rosing wrote:
>> I added a raid 5 to my computer with 3 identical drives. I mounted it. I 
>> moved files to it. It looks like it's working. Then I looked at 
>> /proc/mdstat and it says it's recovering:
...
>> md0 : active raid5 sdb[0] sdd[3] sdc[1]
>>        2930274304 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
>>        [========>...]  recovery = 40.6% (595347620/1465137152) finish=235.8min speed=61467K/sec
> 
> Also I think you put all of the disk space in one large array.  At 40%
> your machine is estimating 235.8 minutes remaining.  Or 590 minutes,
> ten hours, for the full raid sync.  That is fine.  But if you need to
> reboot or have a power failure then the sync will restart ...
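For what it's worth, the kernel's estimate in the quoted mdstat output can be reproduced from its own numbers (a quick sketch; the block counts and speed are taken verbatim from the output above):

```python
# Reproduce the resync ETA from the quoted /proc/mdstat line.
# Figures come straight from the output above (1K blocks, K/sec).
done_kb = 595347620
total_kb = 1465137152
speed_kps = 61467

remaining_min = (total_kb - done_kb) / speed_kps / 60
full_sync_min = total_kb / speed_kps / 60

print(f"remaining: {remaining_min:.1f} min")   # matches finish=235.8min
print(f"full sync: {full_sync_min:.1f} min")
```

Assuming the speed holds, the full-array sync works out to roughly 400 minutes, so closer to seven hours than ten.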

At least with RAID-1 on recent kernels, the kernel maintains some kind
of resync checkpoint, so a graceful reboot doesn't restart the sync at
the start of the array but resumes roughly where it left off.
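That progress is easy to watch programmatically. A small sketch that pulls the figures out of a recovery line like the one quoted above (the regex assumes the usual mdstat wording; the sample string stands in for reading /proc/mdstat):

```python
import re

# Sample recovery line in the format quoted above; in practice you
# would read this from /proc/mdstat.
SAMPLE = ("[========>...]  recovery = 40.6% "
          "(595347620/1465137152) finish=235.8min speed=61467K/sec")

def parse_recovery(line):
    """Extract percent done, finish estimate (min), and speed (K/sec)."""
    m = re.search(
        r"recovery\s*=\s*([\d.]+)%.*finish=([\d.]+)min\s+speed=(\d+)K/sec",
        line)
    if not m:
        return None
    pct, finish, speed = m.groups()
    return float(pct), float(finish), int(speed)

print(parse_recovery(SAMPLE))  # (40.6, 235.8, 61467)
```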

> These days I create partitions of about 250G per partition.  Probably
> 30 minutes per 250G partition by memory on my machines.  Being able to
> check off smaller partitions like that is nicer when doing a large
> data recovery.

I have occasionally thought about doing this, but the problem is that
you end up with a bunch of "tiny" storage devices. What happens when
there's some kind of weird RAID initialization issue on reboot and all
the arrays come up degraded, with half of them using one physical disk
and the other half using the other? With one big array, that kind of
failure case is simply impossible. And since the total resync time
isn't any better, it sure seems like adding complexity without much gain.
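The resync-time point is just arithmetic: at a fixed rebuild speed, six 250 GB partitions take exactly as long in total as one 1.5 TB array. A toy calculation, with an assumed (purely illustrative) 60 MB/s rate:

```python
# Total resync time is the same whether the disk holds one array or
# several, assuming a constant rebuild speed (60 MB/s, illustrative).
speed_mb_s = 60
gb = 1000  # MB per GB, for round numbers

one_array_min = 1500 * gb / speed_mb_s / 60
partitioned_min = sum(250 * gb / speed_mb_s / 60 for _ in range(6))

print(one_array_min, partitioned_min)  # identical totals
```

The only win from splitting is finer-grained checkpoints, which the kernel's own resync checkpointing already provides for graceful reboots.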
