Note on rgrep/powershell/find topic from last night's meeting.
Sean Reifschneider
jafo00 at gmail.com
Wed Jan 12 19:40:14 MST 2022
I've switched to this syntax instead of find|xargs:
find . -type f -exec grep whatever {} +
Or, mostly I just use:
ack whatever
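
For anyone wondering why the "+" form helps: with "\;" find starts one
command per file, while "+" packs many file names onto each command line,
much like xargs does. Here is a minimal sketch to see the difference. It
assumes a POSIX-ish shell with mktemp, and it uses sh -c 'echo run' as a
stand-in for grep so that each invocation can be counted. The temp directory
and the file names are made up for illustration.

```shell
# Make a throwaway directory with five small files (hypothetical names).
tmp=$(mktemp -d)
for i in 1 2 3 4 5; do echo "sample $i" > "$tmp/file$i.txt"; done

# The stand-in command prints one line per invocation, so wc -l counts starts.
per_file=$(find "$tmp" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')
batched=$(find "$tmp" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')
via_xargs=$(find "$tmp" -type f -print0 | xargs -0 sh -c 'echo run' sh | wc -l | tr -d ' ')

echo "-exec ... \\; started the command $per_file times"   # 5, one per file
echo "-exec ... +  started the command $batched times"     # 1 for this small set
echo "xargs        started the command $via_xargs times"   # 1 for this small set

rm -rf "$tmp"
```

On a big tree the "+" and xargs forms still start more than one process,
since each command line has a size limit, but far fewer than one per file.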
On Wed, Jan 12, 2022 at 2:44 PM alan schmitz <alan.schmitz88 at gmail.com>
wrote:
> That is great info, thank you! One test I ran a long time ago wrt find is
> the difference between:
> find . -type f -exec grep notLklyToBeFnd {} \;
> vs.
> find . -type f | xargs grep notLklyToBeFnd
>
> I'm not sure if it holds today, but when I ran it back then the xargs
> version was much faster than the -exec. Of course that was a very long
> time ago.
>
> Alan
>
> On Wed, Jan 12, 2022 at 1:58 PM Brian Sturgill <bsturgill at ataman.com>
> wrote:
>
>> I said I didn't know how fast PWSH's Select-String (its grep equivalent)
>> was, but that I intended to run some benchmarks. The discussion evolved
>> into 'grep -r' vs. 'find ... -exec grep pat {} ...'.
>>
>> Here are some benchmarks.
>> They were run on my main Linux and Windows desktop machines.
>> The Windows box is slightly more powerful than the Linux box.
>> Windows: 11th gen i7 laptop chip (Intel NUC) with 32 GB memory.
>> Linux: 8th gen i5 (8th gen, but recently reengineered; about as fast as
>> 11th gen, with somewhat worse power consumption), 16 GB memory.
>> Both have good M.2 SSDs.
>> My Linux version is Ubuntu 21.10 MATE.
>> My Windows version is the latest stable build of Windows 11.
>>
>> I ran these commands on the same 6+ GB set of ebooks.
>>
>> Commands benchmarked:
>> "rgrep": grep -r notLklyToBeFnd .
>> "find": find . -type f -exec grep notLklyToBeFnd {} \;
>> "pwsh": Get-ChildItem -Path "." -Recurse | Select-String -Pattern
>> "notLklyToBeFnd" -CaseSensitive
>>
>> All times are in seconds.
>>
>> Linux machine, using bash and pwsh (the cross-platform version of Microsoft
>> PowerShell):
>> "rgrep" 3.1
>> "find" 5.5
>> "pwsh" 90
>>
>> Windows machine (using cross-platform PowerShell, and bash in WSL 2.0):
>> "rgrep" 4.6
>> "find" 50
>> "pwsh" 84
>>
>> I also tried the older Windows-only variant of PowerShell, where the
>> "pwsh" command took about 120 seconds.
>>
>> The problem with "pwsh" seems to be the "Select-String" command. On a
>> large PDF file it took 600 milliseconds; grep took only 27 milliseconds.
>>
>> Note that the "rgrep" times were similar between my native Linux box and
>> WSL 2.0, but the "find" time was roughly 10x greater under WSL. I think
>> this may be a caching issue between Ubuntu in WSL 2 and the Windows file
>> system. Ubuntu on WSL 2.0 may be giving up blocks quickly to avoid double
>> caching, which hurts when find starts grep repeatedly. My theory is that
>> the grep code has to be copied back into the Ubuntu address space every
>> time grep starts again.
>> --
>>
>> Brian
>>
>>
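
To rerun the Linux-side comparison from the quoted benchmarks on your own
files, something like the following might do. It is only a sketch, not the
script Brian used: time_cmd, CORPUS, and the throwaway sample files are
hypothetical names for illustration. It also assumes whole-second resolution
from date +%s is good enough, which only holds on a large corpus.

```shell
# Throwaway corpus so the sketch runs anywhere; point CORPUS at real data
# (and drop the rm at the end) to get meaningful numbers.
CORPUS=$(mktemp -d)
for i in 1 2 3; do echo "placeholder text $i" > "$CORPUS/book$i.txt"; done

# time_cmd LABEL CMD...: run CMD with its output discarded, then print the
# label and the elapsed time in whole seconds.
time_cmd() {
    label=$1; shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    last_elapsed=$((end - start))
    printf '%-10s %s\n' "$label" "$last_elapsed"
}

time_cmd 'rgrep'   grep -r notLklyToBeFnd "$CORPUS"
time_cmd 'find \;' find "$CORPUS" -type f -exec grep notLklyToBeFnd {} \;
time_cmd 'find +'  find "$CORPUS" -type f -exec grep notLklyToBeFnd {} +

rm -rf "$CORPUS"
```

It is worth running each command more than once, since the first pass
typically also pays for reading the files from disk into the page cache.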