
Expanding asterisk in bash

I'm trying to run find, and exclude several directories listed in an array. I'm finding some weird behavior when it's expanding, though, which is causing me issues:

~/tmp> skipDirs=( "./dirB" "./dirC" )
~/tmp> bars=$(find . -name "bar*" -not \( -path "${skipDirs[0]}/*" $(printf -- '-o -path "%s/*" ' "${skipDirs[@]:1}") \) -prune); echo $bars
./dirC/bar.txt ./dirA/bar.txt

This did not skip dirC as I would have expected. The problem is that the quotes printf emits around "./dirC/*" are passed to find as literal characters:

~/tmp> set -x 
+ set -x
~/tmp> bars=$(find . -name "bar*" -not \( -path "${skipDirs[0]}/*" $(printf -- '-o -path "%s/*" ' "${skipDirs[@]:1}") \) -prune); echo $bars
+++ printf -- '-o -path "%s/*" ' ./dirC
++ find . -name 'bar*' -not '(' -path './dirB/*' -o -path '"./dirC/*"' ')' -prune
+ bars='./dirC/bar.txt
./dirA/bar.txt'
+ echo ./dirC/bar.txt ./dirA/bar.txt
./dirC/bar.txt ./dirA/bar.txt

If I remove the quotes inside the $(printf ...), the * gets expanded immediately, which also gives the wrong results. If I remove the quotes and escape the * instead, the \ escape character becomes part of the pattern passed to find, and that does not work either. Why does the above not work, and what would work? I'm trying to avoid eval if possible, but currently I don't see a way around it.

Note: This is very similar to: Finding directories with find in bash using a exclude list, however, the posted solutions to that question seem to have the issues I listed above.

John asked Jan 09 '23 20:01


2 Answers

The safe approach is to build the find arguments explicitly in an array:

#!/bin/bash

skipdirs=( "./dirB" "./dirC" )

# Start with -false so that every directory can be appended as "-o ..."
skipdirs_args=( -false )
for i in "${skipdirs[@]}"; do
    skipdirs_args+=( -o -type d -path "$i" )
done

find . \! \( \( "${skipdirs_args[@]}" \) -prune \) -name 'bar*'

I slightly modified the logic of your find command, since it contained a logic error. Your command had this shape:

find -name 'bar*' -not stuff_to_prune_the_dirs

How does find proceed? It walks the file tree, and only when it finds a file (or directory) whose name matches bar* does it evaluate the -not ... part. That's really not what you want: -prune is only ever reached for entries named bar*, so the directories you want to skip are never pruned!

Look at this instead:

find . \! \( -type d -path './dirA' -prune \)

Here find completely prunes the directory ./dirA and prints everything else. It is among everything else that you want to apply the filter -name 'bar*'. The order is very important! There's a big difference between this:

find . -name 'bar*' \! \( -type d -path './dirA' -prune \)

and this:

find . \! \( -type d -path './dirA' -prune \) -name 'bar*'

The first one doesn't work as expected at all: it still reports matches inside ./dirA. The second one is fine.
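To see the difference concretely, here is a minimal sketch that runs both orderings against a throwaway tree. The directory layout and the variable names (tmp, name_first, prune_first) are made up for this illustration and are not part of the question.

```shell
#!/bin/bash
# Hypothetical scratch tree, created only for this demonstration.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/dirA" "$tmp/dirB"
touch "$tmp/dirA/bar.txt" "$tmp/dirB/bar.txt"
cd "$tmp" || exit 1

# -name first: the prune group is only evaluated for entries already named
# bar*, so ./dirA itself is never pruned and ./dirA/bar.txt is still reported.
name_first=$(find . -name 'bar*' \! \( -type d -path './dirA' -prune \) | sort)

# Prune first: ./dirA is cut off before -name is tested on anything inside it.
prune_first=$(find . \! \( -type d -path './dirA' -prune \) -name 'bar*' | sort)

printf 'name first:\n%s\n' "$name_first"
printf 'prune first:\n%s\n' "$prune_first"
```

With this tree, the first form should list both bar.txt files, while the second should list only ./dirB/bar.txt.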

Notes.

  • I'm using \! instead of -not because \! is POSIX, whereas -not is an extension not specified by POSIX. You might argue that I use other non-POSIX features anyway (-false, for example), so using -not would not matter. That's a detail; use whichever you like.
  • You had to use a dirty trick to build the part of the command that skips your directories, because you had to treat the first term separately from the others. By initializing the array with -false, I don't have to treat any term specially.
  • I'm specifying -type d so that I'm sure I'm only pruning directories.
  • Since my pruning really applies to the directories themselves, I don't have to include wildcards in my exclude terms. This is funny: your problem, which seemed to be about wildcards you couldn't handle, disappears completely when you use find appropriately as explained above.
  • Of course, the method I gave also works with wildcards. For example, if you want to exclude/prune all subdirectories called baz inside subdirectories called foo, the skipdirs array given by

    skipdirs=( "./*/foo/baz" "./*/foo/*/baz" )
    

    will work fine!
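As a rough check of the wildcard case, the sketch below builds the same kind of argument array and runs it on a small scratch tree. The tree, the file names bar1/bar2/bar3 and the variable found are assumptions made up for the example; it also assumes a find that supports -false, such as GNU find.

```shell
#!/bin/bash
# Hypothetical scratch tree: two baz directories to prune, one to keep.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/a/foo/baz" "$tmp/a/foo/x/baz" "$tmp/a/keep"
touch "$tmp/a/foo/baz/bar1" "$tmp/a/foo/x/baz/bar2" "$tmp/a/keep/bar3"
cd "$tmp" || exit 1

# Patterns stay quoted, so the shell never globs them; find interprets them.
skipdirs=( "./*/foo/baz" "./*/foo/*/baz" )

# -false is a neutral first term, so every pattern is appended the same way.
skipdirs_args=( -false )
for i in "${skipdirs[@]}"; do
    skipdirs_args+=( -o -type d -path "$i" )
done

found=$(find . \! \( \( "${skipdirs_args[@]}" \) -prune \) -name 'bar*')
echo "$found"
```

Only ./a/keep/bar3 should be reported here, since both baz directories are pruned before find looks inside them.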

gniourf_gniourf answered Jan 15 '23 07:01


The issue here is that the quotes you are using in "%s/*" aren't doing what you think they are.

That is to say, you think you need the quotes in "%s/*" to prevent the output of printf from being globbed, but that isn't what is happening. Try the same thing without the directory separator, and with a file whose name starts and ends with double quotes, and you'll see what I mean.

$ ls
"dirCfoo"
$ skipDirs=( "dirB" "dirC" )
$ printf -- '%s\n' -path "${skipDirs[0]}*" $(printf -- '-o -path "%s*" ' "${skipDirs[@]:1}")
-path
dirB*
-o
-path
"dirCfoo"
$ rm '"dirCfoo"'
$ printf -- '%s\n' -path "${skipDirs[0]}*" $(printf -- '-o -path "%s*" ' "${skipDirs[@]:1}")
-path
dirB*
-o
-path
"dirC*"

See what I mean? The quotes aren't being handled specially by the shell. They just happen not to glob in your case.
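A small sketch of the same point, isolated from globbing: the output of an unquoted command substitution is only split into words, and quote characters inside it are never re-parsed as shell syntax. The array name words is made up for this example, and set -f is used here purely to keep pathname expansion out of the picture.

```shell
#!/bin/bash
set -f   # disable globbing so only word splitting is in play
# The double quotes inside the printf output survive as ordinary characters.
words=( $(printf -- '-o -path "%s/*" ' "./dirC") )
set +f

# Print each resulting word in angle brackets to make the boundaries visible.
printf '<%s>\n' "${words[@]}"
```

The third word comes out as "./dirC/*" with the quotes included, which is exactly the pattern find received in the question's set -x trace.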

This is part of why the approaches discussed at http://mywiki.wooledge.org/BashFAQ/050 don't work.

To do what you want here I believe you need to create the find arguments array manually.

sD=(-path /dev/null)
for dir in "${skipDirs[@]}"; do
    sD+=(-o -path "$dir")
done

and then expand "${sD[@]}" on the find command line (-not \( "${sD[@]}" \) or so).
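Put together, it could look roughly like the sketch below. Note that this is one possible way to finish the command, not necessarily the exact form intended above: instead of -not \( ... \) it uses the common \( ... \) -prune -o ... -print idiom, and the scratch tree and the bars variable are assumptions made up for the example.

```shell
#!/bin/bash
# Hypothetical scratch tree mirroring the layout from the question.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/dirA" "$tmp/dirB" "$tmp/dirC"
touch "$tmp/dirA/bar.txt" "$tmp/dirB/bar.txt" "$tmp/dirC/bar.txt"
cd "$tmp" || exit 1

skipDirs=( "./dirB" "./dirC" )

# "-path /dev/null" never matches a path under ".", so it is a harmless
# first term that lets every real directory be appended as "-o -path ...".
sD=( -path /dev/null )
for dir in "${skipDirs[@]}"; do
    sD+=( -o -path "$dir" )
done

# Prune the listed directories; print bar* matches from everything else.
bars=$(find . \( "${sD[@]}" \) -prune -o -name 'bar*' -print | sort)
echo "$bars"
```

Because each array element reaches find as exactly one argument, no quoting tricks or eval are needed, and directory names containing spaces should work as well.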

And yes, I believe this makes the answer you linked to incorrect, though the other answer there might work (for file names without whitespace, etc.) because of the array indirection that is going on.

Etan Reisner answered Jan 15 '23 09:01