[NCLUG] sw raid, recovery after install

Bob Proulx bob at proulx.com
Wed Jan 2 16:52:18 MST 2013


Stephen Warren wrote:
> Bob Proulx wrote:
> > Also I think you put all of the disk space in one large array.  At 40%
> > your machine is estimating 235.8 minutes remaining.  Or 590 minutes,
> > ten hours, for the full raid sync.  That is fine.  But if you need to
> > reboot or have a power failure then the sync will restart ...
> 
> At least with RAID-1 on recent kernels, the kernel maintains some kind
> of checkpoint history, so at least a graceful reboot doesn't restart the
> sync at the start of the array, but rather roughly where it left off.

It is available but you have to enable it.  Something like:

  mdadm /dev/md1 --grow --bitmap=internal

Then you will see an additional bitmap line in /proc/mdstat.  For
example, on one of my arrays:

  md2 : active raid1 sda6[0] sdb6[1]
        312496256 blocks [2/2] [UU]
        bitmap: 0/3 pages [0KB], 65536KB chunk
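
For completeness, the bitmap can be turned back off again with the
matching grow command.  Just a sketch, assuming the same /dev/md1 as
in the command above:

  # remove the write-intent bitmap again
  mdadm /dev/md1 --grow --bitmap=none
  # the bitmap line should then disappear from /proc/mdstat
  grep -A 2 md1 /proc/mdstat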

I have had other people say that it reduces performance due to the
bookkeeping overhead.  I haven't benchmarked it but haven't noticed
any performance decrease with it on.  I have it enabled on my main
desktop but happen to have it off on other machines.  I would love
some real benchmark data for it.
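
If anyone wants to generate some, a quick and dirty comparison could
be something like the following.  Only a sketch: it assumes a
throwaway file on a file system that lives on /dev/md1, and dd is a
blunt instrument compared to a real benchmark tool.

  # write throughput with the write-intent bitmap enabled
  mdadm /dev/md1 --grow --bitmap=internal
  dd if=/dev/zero of=/srv/ddtest bs=1M count=4096 oflag=direct

  # write throughput with the bitmap removed
  mdadm /dev/md1 --grow --bitmap=none
  dd if=/dev/zero of=/srv/ddtest bs=1M count=4096 oflag=direct

  rm /srv/ddtest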

> > These days I create partitions of about 250G per partition.  Probably
> > 30 minutes per 250G partition by memory on my machines.  Being able to
> > check off smaller partitions like that is nicer when doing a large
> > data recovery.
> 
> I have occasionally thought about doing this, but: The problem with this
> is you end up with a bunch of "tiny" storage devices. What happens when
> there's some kind of weird RAID initialization issue on reboot and all
> the arrays come up degraded, but half using one physical disk and the
> other half the other physical disk.

How often are you seeing "some kind of weird RAID initialization
issue on reboot" that causes an array to hard fail and degrade to one
disk?  That actually sounds pretty scary.  I'm not seeing those.

Take every boot where you have ever seen a RAID initialization
failure that hard failed and degraded one disk in the array due to a
software glitch rather than a hardware failure, and work out that
boot-time software failure rate.  It will be a very small number,
close to zero.  Having four partitions would make such a failure
roughly four times as likely, but I think we are still talking about
extremely small numbers even with a times-four multiplier.

So I agree that it is possible in an absolute sense but very unlikely
in practice.

> At least with 1 big array, that kind of failure case is simply
> impossible.

True.

> And besides, the total resync time isn't any better, so
> it sure seems like adding complexity without much gain.

Agreed that the total re-sync time isn't any different.  But it means
that while doing maintenance work I can reboot in the middle of the
re-sync without losing much of the sync work already done.  Usually
when I need to do that I wait until just after it has checked off one
of the partitions and moved on to the next one, so that I don't lose
more than a few minutes.
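
Watching for one of those moments is easy enough.  A sketch:

  # refresh the re-sync progress once a minute
  watch -n 60 cat /proc/mdstat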

This split into smaller chunks is just my personal preference.

Note that I put LVM on top of these chunks, so the chunk size has
nothing at all to do with the final file system sizes.  Those are set
independently to whatever I need them to be for that system.  And
they can grow and shrink dynamically as needed.
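
To make the layering concrete, here is a sketch with made-up array
names and sizes:

  # each ~250G RAID1 array becomes an LVM physical volume
  pvcreate /dev/md1 /dev/md2 /dev/md3
  vgcreate vg0 /dev/md1 /dev/md2 /dev/md3
  # file systems are carved out of the volume group as needed
  lvcreate -L 100G -n srv vg0
  mkfs.ext4 /dev/vg0/srv
  # and grown online later, independent of the underlying arrays
  lvextend -L +50G /dev/vg0/srv
  resize2fs /dev/vg0/srv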

Since you have read this far, let me also say that I always enable
the SMART daemon to run routine self-tests too.  It isn't a predictor
of failure.  But it can make debugging easier by confirming a
failure.  And you get an email from the daemon when there is a
failure.  And it is all automatic and trouble free.  I typically scan
like this in /etc/smartd.conf:

  # Monitor all attributes, enable automatic offline data collection,
  # automatic Attribute autosave, and start a short self-test every
  # weekday between 3-4am, and a long self-test Saturdays between 3-4am.
  # Ignore attribute 194 temperature change.
  # Ignore attribute 190 airflow temperature change.
  # On failure run all installed scripts.
  /dev/sda -a -o on -S on -s (S/../../[1-5]/03|L/../../6/03) -I 194 -I 190 -m root -M exec /usr/share/smartmontools/smartd-runner
  /dev/sdb -a -o on -S on -s (S/../../[1-5]/03|L/../../6/03) -I 194 -I 190 -m root -M exec /usr/share/smartmontools/smartd-runner
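
When smartd does report a problem the same information can be pulled
up by hand with smartctl.  A sketch:

  # overall health plus the stored self-test and error logs
  smartctl -H /dev/sda
  smartctl -l selftest /dev/sda
  smartctl -l error /dev/sda
  # kick off an immediate short self-test by hand
  smartctl -t short /dev/sda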

Bob


