Question on ZFS
Zak Smith
zak at computer.org
Wed Feb 14 00:21:27 UTC 2024
On Tue, Feb 13, 2024 at 05:03:36PM -0700, Bob Proulx wrote:
> Phil Marsh wrote:
> > Hi Bob, All,
> > I was wondering. Do you recommend using an SSD cache for ZFS, i.e. an L2ARC
> > cache?
>
> I was rather hoping Zak would have jumped in with a response on this
> one as I know Zak is running several large high performance arrays.
> But not having heard anything I will try to muddle through. :-)
Sorry! I was out in the mountains when this came through and meant to
respond but then got lost in to-do items when I got back.
I typically use pools that are made up of several vdevs, each vdev
being a raidz1 with a total of 3 disks. Lately these disks have
been Micron 7+ TB SATA SSDs. So a typical zpool status is
something like this:
  pool: poolajax2
 state: ONLINE
  scan: scrub repaired 0 in 6h3m with 0 errors on Sun Feb 11 07:03:05 2024
config:

        NAME                                           STATE     READ WRITE CKSUM
        poolajax2                                      ONLINE       0     0     0
          raidz1-0                                     ONLINE       0     0     0
            ata-Micron_5210_MTFDDAK7T6QDE_1951258907D  ONLINE       0     0     0
            ata-Micron_5210_MTFDDAK7T6QDE_19512589194  ONLINE       0     0     0
            ata-Micron_5210_MTFDDAK7T6QDE_1951258908E  ONLINE       0     0     0
          raidz1-2                                     ONLINE       0     0     0
            ata-Micron_5300_MTFDDAK7T6TDS_22153919723  ONLINE       0     0     0
            ata-Micron_5300_MTFDDAK7T6TDS_22133A91C46  ONLINE       0     0     0
            ata-Micron_5300_MTFDDAK7T6TDS_22133A91C42  ONLINE       0     0     0
These SSDs are large but not exceptionally fast. Splitting them up
like this allows more IOPS and good overall aggregate throughput (i.e., it
scales pretty much with the number of non-redundant drives, in this
case 4; see below).
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ajax           126G   142  99 891682  99 484110  80   419  99 1703975  90 15249 353
Latency             69761us   12634us     154ms   28101us   59982us   22794us
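For a rough sense of what that scaling means per drive, here is a
back-of-the-envelope sketch. It assumes two 3-disk raidz1 vdevs as in the
status output above, and it simply divides the block write and block read
figures by the number of non-redundant drives. The function names are
made up for illustration, and the division ignores any caching effects,
so treat the results as ballpark numbers only.

```shell
#!/bin/sh
# Non-redundant (data) drives in a pool of identical raidz1 vdevs:
# each raidz1 vdev gives up one disk's worth of space to parity.
data_drives() {
    # $1 = number of vdevs, $2 = disks per vdev
    echo $(( $1 * ($2 - 1) ))
}

# Integer K/sec per data drive for an aggregate bonnie++ figure.
per_drive_kps() {
    # $1 = aggregate K/sec, $2 = number of data drives
    echo $(( $1 / $2 ))
}

n=$(data_drives 2 3)
echo "data drives:       $n"
echo "block write/drive: $(per_drive_kps 891682 "$n") K/sec"
echo "block read/drive:  $(per_drive_kps 1703975 "$n") K/sec"
```

Adding a third 3-disk raidz1 vdev would make it data_drives 3 3, i.e. 6
data drives, which is where the "scales with the number of non-redundant
drives" expectation comes from.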
But I also use an L2ARC ("cache") on an Optane NVMe:
        logs
          nvme-INTEL_SSDPED1D280GA_PHMB7515005680CGN-part1  ONLINE       0     0     0
        cache
          nvme-INTEL_SSDPED1D280GA_PHMB7515006280CGN-part2  ONLINE       0     0     0
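In case it is useful, log and cache devices like these can be attached
to an existing pool with "zpool add". The sketch below only prints the
commands rather than running them. It assumes the partitions already
exist and show up under /dev/disk/by-id (a Linux convention, assumed
here), and the helper name print_zpool_add is made up for illustration.

```shell
#!/bin/sh
# Print (do not run) the zpool commands that would attach a log device
# and a cache (L2ARC) device to an existing pool.
print_zpool_add() {
    # $1 = pool name, $2 = log device, $3 = cache device
    echo "zpool add $1 log $2"
    echo "zpool add $1 cache $3"
}

# Device paths below are assumptions based on the by-id names shown above.
print_zpool_add poolajax2 \
    /dev/disk/by-id/nvme-INTEL_SSDPED1D280GA_PHMB7515005680CGN-part1 \
    /dev/disk/by-id/nvme-INTEL_SSDPED1D280GA_PHMB7515006280CGN-part2
```

Review the printed commands and run them as root only once the device
names are confirmed for your system.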
I am not sure if the log makes a performance difference (i.e., I have
not measured it), but the cache definitely does. If I prefetch most
of my working set when I boot, the response time is much faster
loading from NVMe than from the SSD array. E.g., a prefetch script:
#!/bin/sh
# Read every file under $HOME below the size limit once to warm the cache.
cd || exit 1
find . -type f -size -10048576c | while IFS= read -r a; do
    dd if="$a" of=/dev/null bs=1M > /dev/null 2>&1
    printf "."
done
echo
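To see whether the cache is actually being hit after a prefetch like
this, one option on Linux is to look at the OpenZFS arcstats counters.
This is a minimal sketch, assuming /proc/spl/kstat/zfs/arcstats is
present and has l2_hits and l2_misses rows in the usual "name type data"
layout. The function name l2arc_hit_pct is just illustrative.

```shell
#!/bin/sh
# Print the L2ARC hit percentage from an arcstats-style file.
# Each counter row is expected to look like: "<name> <type> <value>".
l2arc_hit_pct() {
    # $1 = stats file; defaults to the Linux OpenZFS location (assumed)
    awk '$1 == "l2_hits"   { h = $3 }
         $1 == "l2_misses" { m = $3 }
         END {
             if (h + m == 0) { print "n/a"; exit }
             printf "%.1f\n", 100 * h / (h + m)
         }' "${1:-/proc/spl/kstat/zfs/arcstats}"
}
```

Calling l2arc_hit_pct with no argument reads the default path. The
counters are cumulative since the module was loaded, so comparing the
value before and after a workload is more telling than a single reading.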
Hope this helps
Zak
--
Zak Smith
307-543-7820 office
Please do not send private or confidential information via email.