Machine IO slowing over time

Grant Johnson grant at amadensor.com
Thu Feb 5 23:08:55 UTC 2026


Thank you. If the move I did doesn't help, I'll add more drives, build another array, move all of the extents to that new array, and then remove the bad disks from the system.
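Roughly, that migration would look like the following. This is a sketch, not a tested recipe; the device names (/dev/sdc1, /dev/sdd1, /dev/md2) and the volume group name (vg0) are placeholders for whatever the system actually uses:

```shell
# Build a new RAID 1 array from the two new drives (hypothetical partitions)
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1

# Make it a PV and add it to the existing volume group
pvcreate /dev/md2
vgextend vg0 /dev/md2

# Migrate every allocated extent off the old array, then detach it
pvmove /dev/md0
vgreduce vg0 /dev/md0
pvremove /dev/md0
```

pvmove can run while the filesystems stay mounted, though it will add its own IO load while the copy is in progress.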

Or, maybe I should just add more drives as hot spares, then take those other drives offline. No change to the mdadm shape at all.
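The hot-spare route would be something like this. Again a sketch with hypothetical device names, assuming /dev/md0 is the troubled array and /dev/sda1 is the suspect member:

```shell
# Add the new drives to the existing array; mdadm keeps them as hot spares
mdadm /dev/md0 --add /dev/sdc1 /dev/sdd1

# Mark the suspect disk as failed; a spare takes over and the array rebuilds
mdadm /dev/md0 --fail /dev/sda1

# Watch the resync finish, then remove the failed member from the array
cat /proc/mdstat
mdadm /dev/md0 --remove /dev/sda1
```

One disk at a time is safest with RAID 1: let each rebuild complete before failing the next member.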

On February 5, 2026 9:24:58 AM MST, Zak Smith <zak at computer.org> wrote:
>Hi,
>
>I had a situation, actually on a Unifi appliance, where the raid
>performance was dropping enough to cause it to be unfit for purpose,
>but not enough to cause system errors, or even log anything in dmesg.
>I was able to identify the single disk that was causing problems
>because "smartctl -a" would hang for 5-10 seconds when run on *that*
>drive, even though the output itself did not (at that time) show any
>indicators of failure.
>
>
>
>On Thu, Feb 05, 2026 at 08:01:39AM -0700, Grant Johnson wrote:
>> Based on that idea, since it is LVM across 2 RAID 1 arrays, I moved one
>> of my large partitions that seems to be the most trouble from the array
>> that was getting very high waits according to iostat to the other one
>> that is having less io wait.   We will see tonight if the daily
>> snapshots run any smoother now that I moved the bkup volume group to
>> the array with less trouble.
>>
>> Maybe those disks are faster, maybe the freshening of the data will
>> help.   Not sure, but good ideas from all of you sent me at least in a
>> new direction that I have not tried before.
>>
>>
>> On Tue, 2026-02-03 at 15:43 -0700, Bob Proulx wrote:
>> > Grant Johnson wrote:
>> > > What else should I be checking?
>> >
>> > Sometimes reading data from drives will start to take longer when the
>> > data on the drives has been very static for years.  In those cases
>> > the data may degrade to the point where more error correction from
>> > the drive is needed to read it, and the drive may have to retry
>> > reading those sectors more times to get a good read.
>> >
>> > This can be "fluffed" by reading and writing the entire drive.  I
>> > have seen some drives have dramatic improvement.  I have seen other
>> > drives have no change at all.
>> >
>> > The most direct way is to boot a live boot image so that nothing is
>> > touching the disk and then read and write the entire disk in place.
>> >
>> >     time dd if=/dev/sdX of=/dev/sdX bs=1M iflag=fullblock oflag=sync,direct
>> >
>> > Of course doing so can take a long time on a big drive.  I like to
>> > see progress.  The utility I like the best for this is "pv",
>> > pipe-viewer.  If you have a new enough version then it will have the
>> > --direct-io option.  But if not then use dd.
>> >
>> >     time pv /dev/sdX | dd of=/dev/sdX bs=1M iflag=fullblock oflag=sync,direct
>> >
>> > The pv utility can even be used as both the reader and the writer
>> > when using the --direct-io option.  But it offers no control over
>> > block size, so dd is still useful.
>> >
>> > Since the data being written is the data being read, this is safe,
>> > but only if nothing else is touching the drive at the same time.
>> >
>> > Bob
>> >
>
>--
>Zak Smith
>307-543-7820 office
>Please do not send private or confidential information via email.
>


More information about the NCLUG mailing list