Note on rgrep/powershell/find topic from last night's meeting.

alan schmitz alan.schmitz88 at gmail.com
Wed Jan 12 14:11:04 MST 2022


That is great info, thank you!  One test I ran a long time ago wrt find is
the difference between:
find . -type f -exec grep notLklyToBeFnd {} \;
vs.
find . -type f | xargs grep notLklyToBeFnd

I'm not sure it still holds today, but when I ran it the xargs version
was much faster than the -exec one.
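
The usual explanation is that -exec ... \; starts one grep process per
file, while xargs packs many file names into each grep invocation. Here is
a minimal sketch that counts process starts for each form. The scratch
directory and its 50 small files are made up for illustration, and
-print0 / -0 assume GNU or BSD versions of find and xargs:

```shell
# Hypothetical scratch directory holding 50 small files (names contain a space).
tmp=$(mktemp -d)
i=1
while [ "$i" -le 50 ]; do
    echo "line $i" > "$tmp/file $i.txt"
    i=$((i + 1))
done

# -exec ... \; runs the command once per file found.
per_file=$(find "$tmp" -type f -exec sh -c 'echo started' \; | grep -c started)

# -exec ... {} + hands many file names to each run of the command.
batched=$(find "$tmp" -type f -exec sh -c 'echo started' sh {} + | grep -c started)

# xargs batches the same way; -print0 / -0 keep names with spaces intact.
via_xargs=$(find "$tmp" -type f -print0 | xargs -0 sh -c 'echo started' sh | grep -c started)

echo "per-file: $per_file  batched: $batched  xargs: $via_xargs"
rm -rf "$tmp"
```

With only 50 short names, both batched forms should need a single process
start, against 50 for the \; form. On a big tree the batched forms still
start the command more than once, but far fewer times than once per file.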

Alan

On Wed, Jan 12, 2022 at 1:58 PM Brian Sturgill <bsturgill at ataman.com> wrote:

> I said I didn't know how fast pwsh's Select-String (its grep equivalent)
> was, but that I intended to run some benchmarks. The discussion evolved
> into 'grep -r' vs. 'find ... -exec grep pat {} ...'.
>
> Here are some benchmarks.
> They were run on my main Linux and Windows desktop machines.
> The Windows box is slightly more powerful than the Linux box.
> Windows: 11th-gen i7 laptop chip (Intel NUC) with 32 GB memory.
> Linux: 8th-gen i5 (8th gen, but recently reengineered; about as fast as
> 11th gen, but with somewhat worse power consumption), 16 GB memory.
> Both have good M.2 SSDs.
> My Linux version is Ubuntu 21.10 Mate.
> My Windows version is the latest stable build of Windows 11.
>
> I ran these commands on the same set of 6+GB of ebooks.
>
> Commands benchmarked:
> "rgrep": grep -r notLklyToBeFnd .
> "find":  find . -type f -exec grep notLklyToBeFnd {} \;
> "pwsh":  Get-ChildItem -Path "." -Recurse | Select-String -Pattern "notLklyToBeFnd" -CaseSensitive
>
> All times are in seconds.
>
> Linux machine, using bash and pwsh (the cross-platform version of
> Microsoft PowerShell):
> "rgrep"    3.1
> "find"     5.5
> "pwsh"    90
>
> Windows machine (using cross-platform PowerShell, and bash in WSL 2.0):
> "rgrep"    4.6
> "find"    50
> "pwsh"    84
>
> I also tried it with the older Windows-only variant of PowerShell, where
> "pwsh" took about 120 seconds.
>
> The problem with "pwsh" seems to be the "Select-String" command itself.
> On one large PDF file it took 600 milliseconds, while grep took only 27
> milliseconds.
>
> Note that the "rgrep" times were similar between my native Linux box and
> WSL 2.0, but the "find" time was roughly 10x greater under WSL (50 vs.
> 5.5 seconds). I think the issue may be a caching issue between Ubuntu in
> WSL 2 and the Windows file system. Ubuntu on WSL 2.0 may be giving up
> blocks quickly to avoid double caching, which hurts when find starts
> grep over and over. My theory is that the grep code has to be copied back
> into the Ubuntu address space every time grep starts again.
> --
>
> Brian
>
>
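
One rough way to check the startup-cost theory above is to time a couple
hundred grep launches on an empty input, then compare the figure on native
Linux and under WSL 2. A minimal sketch, assuming GNU date (for the %N
nanosecond format); the loop count of 200 is arbitrary:

```shell
# Launch grep n times on an empty input to estimate per-process startup cost.
n=200
start=$(date +%s%N)    # %N (nanoseconds) is a GNU date extension
i=0
while [ "$i" -lt "$n" ]; do
    # No match is expected, so ignore grep's non-zero exit status.
    grep -q notLklyToBeFnd /dev/null || true
    i=$((i + 1))
done
end=$(date +%s%N)

# Average wall-clock time per launch, in microseconds.
per_start_us=$(( (end - start) / n / 1000 ))
echo "about ${per_start_us} microseconds per grep startup"
```

This only measures process launch, not file reads, so a large gap between
the two systems would point at startup overhead rather than at disk access.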