Note on rgrep/powershell/find topic from last night's meeting.
Brian Sturgill
bsturgill at ataman.com
Wed Jan 12 13:58:29 MST 2022
I said I didn't know how fast PWSH's Select-String (its grep equivalent)
was, but that I intended to run some benchmarks. The discussion evolved
into 'grep -r' vs. 'find ... -exec grep pat {} ...'.
Here are some benchmarks.
They were run on my main Linux and Windows desktop machines.
The Windows box is slightly more powerful than the Linux box.
Windows: 11th gen i7 laptop chip [Intel NUC] w/32 GB memory.
Linux: 8th gen i5 (8th gen, but recently reengineered... about as fast as
11th gen, with somewhat worse power consumption), 16 GB memory.
Both have good M.2 SSDs.
My Linux version is Ubuntu 21.10 MATE.
My Windows version is the latest stable build of Windows 11.
I ran these commands on the same set of 6+GB of ebooks.
Commands benchmarked:
"rgrep": grep -r notLklyToBeFnd .
"find": find . -type f -exec grep notLklyToBeFnd {} \;
"pwsh": Get-ChildItem -Path "." -Recurse | Select-String -Pattern "notLklyToBeFnd" -CaseSensitive
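The post does not say how the times were taken. As a rough sketch of one
way to time the two bash commands, assuming GNU date (for %N) and awk are
available, something like the following could be used. The helper name
time_cmd is hypothetical and is not from the original post.

```shell
# Hypothetical helper (not from the original post): run a command with its
# output discarded and print the elapsed wall-clock time in seconds.
# Assumes GNU date, which supports %N (nanoseconds).
time_cmd() {
    local start end
    start=$(date +%s.%N)
    # grep exits non-zero when nothing matches, so ignore the exit status.
    "$@" > /dev/null 2>&1 || true
    end=$(date +%s.%N)
    # awk does the floating-point subtraction that shell arithmetic lacks.
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.1f\n", e - s }'
}
```

Usage would look like "time_cmd grep -r notLklyToBeFnd ." or
"time_cmd find . -type f -exec grep notLklyToBeFnd {} \;". Whether the
file cache is warm or cold will affect the numbers, so repeating each run
a few times is probably wise.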
All times are in seconds.
Linux machine, using bash and pwsh (the cross-platform version of Microsoft
PowerShell):
"rgrep" 3.1
"find" 5.5
"pwsh" 90
Windows machine, using cross-platform PowerShell, and bash in WSL 2.0:
"rgrep" 4.6
"find" 50
"pwsh" 84
I also tried it in the older Windows-only variant of PowerShell, where
"pwsh" took about 120 seconds.
The problem with "pwsh" seems to be the "Select-String" command.
On a large PDF file it took 600 milliseconds; grep took only 27
milliseconds.
Note that the "rgrep" times were similar between my native Linux box and
WSL 2.0, but the "find" time was about 10x greater. I think this may be a
caching issue between Ubuntu in WSL 2 and the Windows file system. Ubuntu
on WSL 2.0 may be giving up blocks quickly to avoid double caching... this
hurts when using find to start grep repeatedly. My theory is that the grep
code has to get copied back into the Ubuntu address space every time grep
starts again.
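One way to probe this theory, offered only as a sketch and not something
that was run for the numbers above: find's standard "-exec ... {} +" form
passes many file names to each grep invocation, so grep starts far fewer
times. If per-process startup is the cost, the batched form should land
much closer to the "rgrep" time. The function names below are
hypothetical, and -H is added (it is not in the original commands) so both
forms print the file name and produce comparable output.

```shell
# Hypothetical helpers (not from the original post) for comparing the two
# ways find can invoke grep.

# One grep process per file, as in the "find" benchmark.
search_per_file() {
    local pattern=$1 dir=${2:-.}
    find "$dir" -type f -exec grep -H -- "$pattern" {} \;
}

# Batched: find appends as many file names as fit on one command line,
# so grep is started only a handful of times.
search_batched() {
    local pattern=$1 dir=${2:-.}
    find "$dir" -type f -exec grep -H -- "$pattern" {} +
}
```

Note that with "{} +" find reports a non-zero exit status if any grep
invocation found no match, which matters only if a script checks the
return code.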
--
Brian