Note on rgrep/powershell/find topic from last night's meeting.

Bob Proulx bob at proulx.com
Wed Jan 12 21:22:30 MST 2022


alan schmitz wrote:
> That is great info, thank you!  One test I ran a long time ago wrt find is
> the difference between:
> find . -type f -exec grep notLklyToBeFnd {} \;
> vs.
> find . -type f | xargs grep notLklyToBeFnd
> 
> I'm not sure if it holds today, but when I ran it back then the xargs
> version was much faster than the -exec.  Of course that was a very long
> time ago.

Because the first find above ends with \; it executes grep once per
file.  Conceptually it is similar to this, with lots of grep
invocations.

    for file in $(find . -type f -print); do
        grep PATTERN "$file"
    done

The xargs version, in contrast, spools up as many file names as
possible for one grep.  The file I/O is the same, but the process
fork() and exec() calls are reduced to the minimum.  There is still
that pipe, though, where file names are passed as character I/O from
find to xargs.  Conceptually it is somewhat similar to this.

    grep PATTERN $(find . -type f -print | while IFS= read -r file; do echo $file; done)
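
To see the batching on a small scale, here is a minimal sketch.  The
-n 2 limit is artificial and only there so the grouping is visible
with a handful of names; without it xargs fills each command line up
to the system limit.  The names a through e are placeholders, not
real files.

```shell
#!/bin/sh
# Five names arrive on stdin.  -n 2 caps each command line at two
# arguments, so echo is started three times instead of five.
batches=$(printf '%s\n' a b c d e | xargs -n 2 echo)

# Each output line is the argument list of one echo invocation.
printf '%s\n' "$batches"
```

Each line printed corresponds to one process that xargs started.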

So it is better to use + instead of \; because with + find invokes
grep as few times as possible, usually just once, each time with as
many args as will fit.  It's doing exactly what xargs did but doing it
completely internal to find.  That saves not only the process
creation time but also the pipe I/O of find writing into the pipe and
xargs reading from it.  Since it is all in find now that character
I/O between processes is no longer needed.

    find . -type f -exec grep PATTERN {} +
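
A rough way to check the invocation counts yourself.  This is only a
sketch: the scratch directory, the three empty files, and the sh -c
one-liner standing in for grep are all made up for the demonstration,
and it assumes mktemp -d is available on your system.

```shell
#!/bin/sh
# Build a throwaway tree holding three files.
demo=$(mktemp -d)
mkdir "$demo/tree"
touch "$demo/tree/a" "$demo/tree/b" "$demo/tree/c"

# The stand-in command appends one line to a log file ($0) every time
# find starts it, so the line count equals the number of invocations.
find "$demo/tree" -type f -exec sh -c 'echo run >> "$0"' "$demo/semi.log" {} \;
find "$demo/tree" -type f -exec sh -c 'echo run >> "$0"' "$demo/plus.log" {} +

semi_runs=$(wc -l < "$demo/semi.log" | tr -d ' ')
plus_runs=$(wc -l < "$demo/plus.log" | tr -d ' ')
printf '%s\n' "semicolon form: $semi_runs runs, plus form: $plus_runs runs"

rm -rf "$demo"
```

With only three files the + form fits everything into one command
line.  On a very large tree it may need more than one, but still far
fewer than one per file.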

Conceptually it is similar to the following, but it avoids all of the
problems with whitespace and special characters in file names.

    grep PATTERN $(find . -type f -print)
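
The whitespace problem is easy to reproduce.  A minimal sketch,
assuming mktemp -d is available; the file name "two words.txt" and
the pattern "needle" are invented for the test.

```shell
#!/bin/sh
# One file whose name contains a space and whose content matches.
demo=$(mktemp -d)
printf 'needle\n' > "$demo/two words.txt"

# Unquoted command substitution: the shell splits the name at the
# space, so grep is handed two names that do not exist and finds
# nothing.  Its error messages are discarded here.
split_hits=$(grep -l needle $(find "$demo" -type f -print) 2>/dev/null | wc -l | tr -d ' ')

# -exec ... {} + hands each name to grep as one intact argument.
exec_hits=$(find "$demo" -type f -exec grep -l needle {} + | wc -l | tr -d ' ')

printf '%s\n' "substitution: $split_hits match, -exec +: $exec_hits match"

rm -rf "$demo"
```

The command substitution form misses the file entirely, while the
-exec form reports it.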

Why isn't + as well known as \; is?  Because + is *new* in the grand
scheme of things.  It was introduced in 2005.  Or as I say, just the
other day!  So if you learned how to use find before 2005 you learned
the \; syntax.  But it's been 17 years now and -exec ... {} + is part
of the POSIX standard, so every conforming find must implement it.
Most people entering the work force today never learned the old
syntax.

I will end by repeating the current best practice.

    find . -type f -exec grep PATTERN {} +

Bob

