Note on rgrep/powershell/find topic from last night's meeting.

alan schmitz alan.schmitz88 at gmail.com
Wed Jan 12 19:55:54 MST 2022


Nice 👍

On Wed, Jan 12, 2022, 7:40 PM Sean Reifschneider <jafo00 at gmail.com> wrote:

> I've switched to this syntax instead of find|xargs:
>
> find . -type f -exec grep whatever {} +
>
> Or, mostly I just use:
>
> ack whatever
>
> On Wed, Jan 12, 2022 at 2:44 PM alan schmitz <alan.schmitz88 at gmail.com>
> wrote:
>
>> That is great info, thank you!  One test I ran a long time ago wrt find
>> is the difference between:
>> find . -type f -exec grep notLklyToBeFnd {} \;
>> vs.
>> find . -type f | xargs grep notLklyToBeFnd
>>
>> I'm not sure if it holds today, but when I ran it back then the xargs
>> version was much faster than the -exec.  Of course that was a very long
>> time ago.
>>
>> Alan
>>
>> On Wed, Jan 12, 2022 at 1:58 PM Brian Sturgill <bsturgill at ataman.com>
>> wrote:
>>
>>> I said I didn't know how fast pwsh's Select-String (its grep equivalent)
>>> was, but that I intended to run some benchmarks. The discussion evolved
>>> into 'grep -r' vs. 'find ... -exec grep pat {} ...'.
>>>
>>> Here are some benchmarks.
>>> They were run on my main Linux and Windows desktop machines.
>>> The Windows box is slightly more powerful than the Linux box.
>>> Windows: 11th-gen i7 laptop chip (Intel NUC) with 32 GB of memory.
>>> Linux: 8th-gen i5 (8th gen, but recently reengineered... about as fast as
>>> 11th gen, with somewhat worse power consumption), 16 GB of memory.
>>> Both have good M.2 SSDs.
>>> My Linux version is Ubuntu 21.10 Mate.
>>> My Windows version is the latest stable build of Windows 11.
>>>
>>> I ran these commands on the same set of 6+GB of ebooks.
>>>
>>> Commands benchmarked:
>>> "rgrep": grep -r notLklyToBeFnd .
>>> "find": find . -type f -exec grep notLklyToBeFnd {} \;
>>> "pwsh": Get-ChildItem -Path "." -Recurse | Select-String -Pattern "notLklyToBeFnd" -CaseSensitive
>>>
>>> All times are in seconds.
>>>
>>> Linux machine, using bash and pwsh (the cross-platform version of
>>> Microsoft PowerShell):
>>> "rgrep"    3.1
>>> "find"     5.5
>>> "pwsh"    90
>>>
>>> Windows machine, using cross-platform PowerShell and bash in WSL 2.0:
>>> "rgrep"    4.6
>>> "find"    50
>>> "pwsh"    84
>>>
>>> I also tried it with the older Windows-only variant of PowerShell, where
>>> "pwsh" took about 120 seconds.
>>>
>>> The problem with "pwsh" seems to be the "Select-String" command. On a
>>> large PDF file it took 600 milliseconds; grep took only 27 milliseconds.
>>>
>>> Note that the "rgrep" times were similar between my native Linux box and
>>> WSL 2.0, but the "find" time was about 10x greater under WSL (50 vs. 5.5
>>> seconds). I think this may be a caching issue between Ubuntu in WSL 2 and
>>> the Windows file system. Ubuntu on WSL 2.0 may be giving up blocks
>>> quickly to avoid double caching... this hurts when using find to
>>> repeatedly start grep. My theory is that the grep code has to get copied
>>> back into the Ubuntu address space every time grep starts again.
>>> --
>>>
>>> Brian
>>>
>>>
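
A footnote on the find variants discussed in this thread: the gap between
`-exec ... \;` on one side and `xargs` or `-exec ... +` on the other comes
mostly from how many times the command gets started. The sketch below is not
from the thread. It builds a throwaway directory (the file names and the count
of five are made up for illustration) and swaps grep for a small `sh -c` stub
that prints one line per start, so the starts can be counted. It assumes
`-print0` and `xargs -0` are available (GNU and BSD tools have them); they keep
file names with spaces intact, which matters for something like an ebook
collection.

```shell
#!/bin/sh
# Illustration only: count how often find starts the command it is given.
# The directory layout and file names here are hypothetical.
tmp=$(mktemp -d)
for i in 1 2 3 4 5; do
    echo "sample text $i" > "$tmp/book $i.txt"
done

# Stub standing in for grep: prints one line each time it is started.
stub='echo started'

# "-exec ... \;" starts the command once per file.
per_file=$(find "$tmp" -type f -exec sh -c "$stub" sh {} \; | grep -c started)

# "-exec ... +" passes many files to each start.
batched=$(find "$tmp" -type f -exec sh -c "$stub" sh {} + | grep -c started)

# "find | xargs" batches the same way; -print0/-0 protect odd file names.
piped=$(find "$tmp" -type f -print0 | xargs -0 sh -c "$stub" sh | grep -c started)

echo "per-file starts: $per_file"
echo "batched starts:  $batched"
echo "xargs starts:    $piped"

rm -rf "$tmp"
```

With five files this should report 5, 1 and 1. On a real tree, replacing the
stub with `grep whatever` gives the commands discussed above; with thousands of
files the per-file form starts thousands of grep processes, while the batched
forms start only a handful.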