bash:globs
Differences
This shows you the differences between two versions of the page.
bash:globs [2019/12/07 12:53] – peter | bash:globs [2021/02/04 09:43] (current) – [Glob Star - globstar] peter | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== BASH - Globs ====== | ====== BASH - Globs ====== | ||
- | "Glob" | + | **Glob** or **Globstar** |
Some synonyms for globbing (depending on the context in which it appears) are pattern matching, pattern expansion, filename expansion, and so on. | Some synonyms for globbing (depending on the context in which it appears) are pattern matching, pattern expansion, filename expansion, and so on. | ||
Line 7: | Line 7: | ||
A glob may look like ***.txt** and, when used to match filenames, is sometimes called a " | A glob may look like ***.txt** and, when used to match filenames, is sometimes called a " | ||
- | Traditional shell globs use a very simple syntax, which is less expressive than a RegularExpression. | + | Traditional shell globs use a very simple syntax, which is less expressive than a Regular Expression. |
Most characters in a glob are treated literally, but a * matches 0 or more characters, a ? matches precisely one character, and [...] matches any single character in a specified set (see Ranges below). | Most characters in a glob are treated literally, but a * matches 0 or more characters, a ? matches precisely one character, and [...] matches any single character in a specified set (see Ranges below). | ||
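The three wildcards can be tried out without touching any files, because `case` uses the same glob syntax. The following is a minimal sketch; the helper name `matches` is made up for this illustration and is not part of the page.

```bash
# matches STRING PATTERN: succeed if STRING matches the glob PATTERN.
# (Hypothetical helper; an unquoted expansion in a case pattern is a glob.)
matches() {
    case $1 in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

matches "notes.txt" "*.txt"     && echo "* matches any run of characters"
matches "a.txt"     "?.txt"     && echo "? matches exactly one character"
matches "ab.txt"    "?.txt"     || echo "? does not match two characters"
matches "file3"     "file[0-9]" && echo "[...] matches one character from the set"
```

Note that the leading-dot rule for filenames does not apply here: in `case`, `*` also matches a string that starts with a dot.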
Line 35: | Line 35: | ||
tar xvf *.tar | tar xvf *.tar | ||
# Expands to: tar xvf file1.tar file2.tar file42.tar ... | # Expands to: tar xvf file1.tar file2.tar file42.tar ... | ||
- | # (which is generally not what one wants) | ||
</ | </ | ||
Line 49: | Line 48: | ||
# This is safe even if a filename contains whitespace: | # This is safe even if a filename contains whitespace: | ||
for f in *.tar; do | for f in *.tar; do | ||
- | | + | |
done | done | ||
# But this one is not: | # But this one is not: | ||
for f in $(ls | grep ' | for f in $(ls | grep ' | ||
- | | + | |
done | done | ||
</ | </ | ||
- | In the second example above, the output of **ls** is filtered, and then the result of the whole pipeline is divided into words, to serve as iterative values for the loop. | + | <WRAP info> |
+ | **NOTE: | ||
This word-splitting will occur at internal whitespace within each filename, which makes it useless in the general case. | This word-splitting will occur at internal whitespace within each filename, which makes it useless in the general case. | ||
Line 65: | Line 65: | ||
For more such examples, see BashPitfalls. | For more such examples, see BashPitfalls. | ||
+ | |||
+ | </ | ||
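To see the difference in practice, the sketch below loops over a glob in a scratch directory that contains a filename with a space. The directory and file names are assumptions chosen for the demonstration.

```bash
# Work in a scratch directory so the demo touches nothing else.
tmp=$(mktemp -d)
touch "$tmp/plain.tar" "$tmp/with space.tar"

# The glob yields one word per file, regardless of whitespace in the name.
count=0
for f in "$tmp"/*.tar; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matched
    count=$((count + 1))
    printf 'archive: %s\n' "$f"
done
echo "found $count archives"

rm -r "$tmp"
```

The loop body runs once per file, so the count is 2 even though one name contains a space.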
---- | ---- | ||
Line 76: | Line 78: | ||
<code bash> | <code bash> | ||
case " | case " | ||
- | | + | |
- | [Nn]*) confirm=0;; | + | [Nn]*) confirm=0;; |
- | *) echo "I don't understand. | + | *) echo "I don't understand. |
esac | esac | ||
</ | </ | ||
- | Patterns (which are separated by | characters) are matched against the first word after the case itself. | + | <WRAP info> |
+ | **NOTE: | ||
The first pattern which matches, " | The first pattern which matches, " | ||
+ | |||
+ | </ | ||
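A small, self-contained variation on the idea may help. It is only a sketch: the function name `classify_answer` and the exact patterns are assumptions, not the code from the page.

```bash
# Classify an answer using glob patterns; alternatives are separated by |.
# (The function name classify_answer is made up for this example.)
classify_answer() {
    case $1 in
        [Yy]|[Yy][Ee][Ss]) echo yes ;;
        [Nn]|[Nn][Oo])     echo no ;;
        *)                 echo unknown ;;
    esac
}

classify_answer Yes     # prints: yes
classify_answer n       # prints: no
classify_answer maybe   # prints: unknown
```

Because the first matching pattern wins, the catch-all `*` has to come last.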
---- | ---- | ||
Line 128: | Line 133: | ||
|< | |< | ||
- | In most shell implementations, | + | <WRAP info> |
+ | **NOTE: | ||
However, POSIX specifies ! for this role, and therefore ! is the standard choice. | However, POSIX specifies ! for this role, and therefore ! is the standard choice. | ||
Recent Bash versions interpret [a-d] as [abcd]. | Recent Bash versions interpret [a-d] as [abcd]. | ||
+ | |||
+ | </ | ||
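A negated set is handy for validation. The sketch below uses the POSIX `!` form to test whether a string consists only of digits; `is_digits` is a hypothetical helper name.

```bash
# Succeed only if the argument is non-empty and contains nothing but digits.
# [!0-9] is the POSIX spelling of "any one character that is not a digit".
is_digits() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

is_digits 12345 && echo "12345: digits only"
is_digits 12a45 || echo "12a45: contains a non-digit"
```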
---- | ---- | ||
- | ===== Options which change globbing | + | ===== Options which change globbing |
==== Extended Globs - extglob ==== | ==== Extended Globs - extglob ==== | ||
Line 163: | Line 171: | ||
</ | </ | ||
- | Patterns in a list are separated by | characters. | + | <WRAP info> |
+ | **NOTE: | ||
+ | </ | ||
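As an illustration of a pattern list, the sketch below uses `@(...)` to accept exactly one of several extensions. The function name `kind` and the list of extensions are assumptions for this example.

```bash
shopt -s extglob   # enable on its own line, before any extended glob is parsed

# kind NAME: report whether NAME ends in one of the listed image extensions.
kind() {
    case $1 in
        *.@(jpg|jpeg|png)) echo image ;;
        *)                 echo other ;;
    esac
}

kind photo.jpeg   # prints: image
kind notes.txt    # prints: other
```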
---- | ---- | ||
Line 174: | Line 185: | ||
# To remove all the files except ones matching *.jpg: | # To remove all the files except ones matching *.jpg: | ||
rm !(*.jpg) | rm !(*.jpg) | ||
+ | |||
# All except *.jpg and *.gif and *.png: | # All except *.jpg and *.gif and *.png: | ||
rm !(*.jpg|*.gif|*.png) | rm !(*.jpg|*.gif|*.png) | ||
Line 181: | Line 193: | ||
<code bash> | <code bash> | ||
- | # To copy all the MP3 songs except one to your device | + | # To copy all the MP3 songs except one to your device. |
cp !(04*).mp3 /mnt | cp !(04*).mp3 /mnt | ||
</ | </ | ||
Line 192: | Line 204: | ||
<code bash> | <code bash> | ||
- | # To trim leading and trailing whitespace from a variable | + | # To trim leading and trailing whitespace from a variable. |
x=${x## | x=${x## | ||
</ | </ | ||
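One common way to write such a trim with extended globs is sketched below. It is not necessarily the exact command from the page; the sample value of `x` is an assumption.

```bash
shopt -s extglob

x=$'  \t hello world \n'
x=${x##+([[:space:]])}   # strip the longest leading run of whitespace
x=${x%%+([[:space:]])}   # strip the longest trailing run of whitespace
printf '[%s]\n' "$x"     # prints: [hello world]
```

Interior whitespace is left alone, because `##` and `%%` only remove a match anchored at the start or the end of the value.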
Line 206: | Line 218: | ||
</ | </ | ||
- | extglob changes the way certain characters are parsed. | + | |
<WRAP info> | <WRAP info> | ||
- | It is necessary to have a newline (not just a semicolon) between shopt -s extglob and any subsequent commands to use it. | + | **NOTE: |
+ | |||
+ | It is necessary to have a newline (not just a semicolon) between | ||
You cannot enable extended globs inside a group command that uses them, because the entire block is parsed before the **shopt** is evaluated. | You cannot enable extended globs inside a group command that uses them, because the entire block is parsed before the **shopt** is evaluated. | ||
- | Note that the typical function body is a group command. An unpleasant workaround could be to use a subshell command list as the function body. | + | The typical function body is a group command. |
</ | </ | ||
Line 220: | Line 234: | ||
<code bash> | <code bash> | ||
# | # | ||
- | shopt -s extglob | + | shopt -s extglob |
</ | </ | ||
Line 226: | Line 240: | ||
<code bash> | <code bash> | ||
- | # remember | + | # Remember |
shopt -q extglob; extglob_set=$? | shopt -q extglob; extglob_set=$? | ||
- | # set extglob if it wasn't originally set. | + | # Set extglob if it wasn't originally set. |
((extglob_set)) && shopt -s extglob | ((extglob_set)) && shopt -s extglob | ||
# Note, 0 (true) from shopt -q is " | # Note, 0 (true) from shopt -q is " | ||
# The basic concept behind the following is to delay parsing of the globs until evaluation. | # The basic concept behind the following is to delay parsing of the globs until evaluation. | ||
- | # This matters at group commands, such as functions in { } blocks | + | # This matters at group commands, such as functions in { } blocks. |
declare -a s='( !(x) )' | declare -a s='( !(x) )' | ||
Line 242: | Line 256: | ||
eval 'echo !(x)' | eval 'echo !(x)' | ||
- | # unset extglob if it wasn't originally set | + | # Unset extglob if it wasn't originally set. |
((extglob_set)) && shopt -u extglob | ((extglob_set)) && shopt -u extglob | ||
Line 258: | Line 272: | ||
ls: cannot access *.c: No such file or directory | ls: cannot access *.c: No such file or directory | ||
- | # with nullglob set | + | # With nullglob set. |
shopt -s nullglob | shopt -s nullglob | ||
ls *.c | ls *.c | ||
Line 279: | Line 293: | ||
<WRAP important> | <WRAP important> | ||
- | **Warning:** Enabling nullglob on a wide scope can trigger bugs caused by bad programming practices. | + | **WARNING:** Enabling nullglob on a wide scope can trigger bugs caused by bad programming practices. |
It " | It " | ||
Line 289: | Line 303: | ||
unset array[1] | unset array[1] | ||
#unsets nothing | #unsets nothing | ||
+ | |||
unset -v " | unset -v " | ||
#correct | #correct | ||
Line 298: | Line 313: | ||
shopt -s nullglob | shopt -s nullglob | ||
array=([1]=*) | array=([1]=*) | ||
- | #results | + | # Results |
</ | </ | ||
- | This was reported as a bug in 2012, yet is unchanged to this day. | + | This was reported as a [[http:// |
+ | </ | ||
Apart from a few builtins that use modified parsing under special conditions (e.g. declare), always use quotes when arguments to simple commands could be interpreted as globs. | Apart from a few builtins that use modified parsing under special conditions (e.g. declare), always use quotes when arguments to simple commands could be interpreted as globs. | ||
Line 308: | Line 324: | ||
To prevent pathname expansion occurring in unintended places, you can set **failglob**. | To prevent pathname expansion occurring in unintended places, you can set **failglob**. | ||
- | </ | + | |
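The quoting advice can be demonstrated even with nullglob off, as long as a matching file exists. The sketch below assumes default shell options (no nullglob, no failglob) and uses a scratch directory with a file named `array1`, which the unquoted word `array[1]` matches as a glob.

```bash
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch array1                 # a file whose name matches the glob array[1]

array=(a b c)
unset array[1]               # expands to "unset array1": the array is untouched
before=${#array[@]}          # still 3 elements

unset -v 'array[1]'          # quoted: really removes element 1
after=${#array[@]}           # now 2 elements

echo "before: $before, after: $after"
cd - >/dev/null
rm -r "$tmp"
```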
---- | ---- | ||
+ | ==== Null Glob Portability ==== | ||
- | Portability | + | "null globbing" |
- | "null globbing" | + | In portable scripts, you must explicitly check that a glob match was successful by checking that the files actually exist. |
- | + | ||
- | Toggle line numbers | + | |
+ | <code bash> | ||
# POSIX | # POSIX | ||
for x in *; do | for x in *; do | ||
- | | + | |
- | ... | + | ... |
done | done | ||
f() { | f() { | ||
- | | + | |
- | | + | |
- | ... | + | ... |
- | done | + | done |
} | } | ||
f * || echo "No files found" | f * || echo "No files found" | ||
+ | </ | ||
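A runnable sketch of the existence check follows. The helper `count_existing` is hypothetical, and `mktemp -d` is used only to get a scratch directory (it is widely available but not specified by POSIX).

```bash
# count_existing ARGS...: count the arguments that name real files.
# If a glob matched nothing, the single argument is the literal pattern,
# which fails both tests below.
count_existing() {
    n=0
    for x in "$@"; do
        [ -e "$x" ] || [ -L "$x" ] || continue   # -L keeps broken symlinks
        n=$((n + 1))
    done
    echo "$n"
}

tmp=$(mktemp -d)
empty=$(count_existing "$tmp"/*)    # no files yet
touch "$tmp/a" "$tmp/b"
filled=$(count_existing "$tmp"/*)   # two files
echo "empty: $empty, filled: $filled"
rm -r "$tmp"
```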
Some modern POSIX-compatible shells allow null globbing as an extension. | Some modern POSIX-compatible shells allow null globbing as an extension. | ||
- | Toggle line numbers | + | <code bash> |
# Bash | # Bash | ||
shopt -s nullglob | shopt -s nullglob | ||
+ | </ | ||
- | In ksh93, there is no toggle-able option. Rather, that the " | + | In ksh93, there is no toggle-able option. |
- | + | ||
- | Toggle line numbers | + | |
+ | <code bash> | ||
# ksh93 | # ksh93 | ||
for x in ~(N)*; do | for x in ~(N)*; do | ||
- | | + | |
done | done | ||
+ | </ | ||
In zsh, a toggle-able option (NULL_GLOB) or a glob qualifier (N) can be used. | In zsh, a toggle-able option (NULL_GLOB) or a glob qualifier (N) can be used. | ||
- | Toggle line numbers | + | <code bash> |
# zsh | # zsh | ||
for x in *(N); do ...; done # or setopt NULL_GLOB | for x in *(N); do ...; done # or setopt NULL_GLOB | ||
+ | </ | ||
mksh doesn' | mksh doesn' | ||
- | dotglob | + | ---- |
- | By convention, a filename beginning with a dot is " | + | ===== Dot Glob - dotglob |
- | Toggle line numbers | + | By convention, a filename beginning with a dot is " |
+ | Globbing uses the same convention -- filenames beginning with a dot are not matched by a glob, unless the glob also begins with a dot. | ||
+ | |||
+ | Bash has a dotglob option that lets globs match "dot files": | ||
+ | |||
+ | <code bash> | ||
shopt -s dotglob nullglob | shopt -s dotglob nullglob | ||
files=(*) | files=(*) | ||
echo "There are ${# | echo "There are ${# | ||
+ | </ | ||
- | It should be noted that when dotglob is enabled, * will match files like .bashrc but not the . or .. directories. This is orthogonal to the problem of matching "just the dot files" -- a glob of .* will match . and .., typically causing problems. | + | It should be noted that when dotglob is enabled, * will match files like .bashrc but not the . or .. directories. |
- | globstar (since bash 4.0-alpha) | + | This is orthogonal to the problem of matching "just the dot files" |
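The effect of dotglob can be checked in a scratch directory. In this sketch the file names are assumptions; the arrays `without` and `with` hold the matches of the same glob before and after enabling the option.

```bash
tmp=$(mktemp -d)
touch "$tmp/.hidden" "$tmp/visible"

shopt -u dotglob
without=("$tmp"/*)           # only "visible"

shopt -s dotglob
with=("$tmp"/*)              # ".hidden" and "visible", but never . or ..

shopt -u dotglob
echo "without dotglob: ${#without[@]}, with dotglob: ${#with[@]}"
rm -r "$tmp"
```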
- | globstar recursively repeats a pattern containing ' | + | ---- |
- | Toggle line numbers | + | ===== Glob Star - globstar ===== |
- | $ shopt -s globstar; tree | + | (since bash 4.0-alpha) |
+ | |||
+ | globstar recursively repeats a pattern containing **< | ||
+ | |||
+ | <code bash> | ||
+ | shopt -s globstar; tree | ||
. | . | ||
├── directory2 | ├── directory2 | ||
Line 388: | Line 416: | ||
├── file1 | ├── file1 | ||
└── file2.c | └── file2.c | ||
+ | </ | ||
- | # Suppose that for the following examples. | + | Assume the directory tree above for the following examples. |
Matching files: | Matching files: | ||
- | Toggle line numbers | + | <code bash> |
$ files=(**) | $ files=(**) | ||
# equivalent to: files=(* */* */*/*) | # equivalent to: files=(* */* */*/*) | ||
Line 404: | Line 432: | ||
# corresponds to: find -name " | # corresponds to: find -name " | ||
# Caveat: **.c will not work, as it expands to *.c/*.c/… | # Caveat: **.c will not work, as it expands to *.c/*.c/… | ||
+ | </ | ||
- | Just like '*', '**' followed by a '/' | + | <WRAP info> |
+ | **NOTE:** To disable globstar use | ||
- | Toggle line numbers | + | <code bash> |
+ | shopt -u globstar | ||
+ | </ | ||
- | $ files=(**/) | + | See: |
- | # finds all subdirectories | + | </ |
- | $ files=(. **/) | ||
- | # finds all subdirectories, | ||
- | # corresponds to: find -type d | ||
- | failglob | + | ---- |
- | If a pattern fails to match, bash reports an expansion error. This can be useful at the commandline: | + | Assume you have a folder structure: |
- | Toggle line numbers | + | < |
+ | . | ||
+ | ├── bar | ||
+ | │ | ||
+ | │ | ||
+ | │ | ||
+ | │ | ||
+ | │ | ||
+ | │ | ||
+ | └── fnord.txt | ||
+ | </ | ||
+ | Then **ls** with single star **< | ||
+ | |||
+ | <code bash> | ||
+ | ls *.txt | ||
+ | fnord.txt | ||
+ | </ | ||
+ | |||
+ | The double star operator **< | ||
+ | |||
+ | <code bash> | ||
+ | ls **/*.txt | ||
+ | bar/ | ||
+ | </ | ||
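The comparison can be reproduced with a throwaway tree. The sketch below assumes bash 4.0 or later; apart from `bar` and `fnord.txt`, the directory and file names are invented for the demonstration.

```bash
tmp=$(mktemp -d)
mkdir -p "$tmp/bar/baz"
touch "$tmp/fnord.txt" "$tmp/bar/foo.txt" "$tmp/bar/baz/deep.txt"

shopt -s globstar
top=("$tmp"/*.txt)           # single star: only fnord.txt
all=("$tmp"/**/*.txt)        # double star: fnord.txt plus both nested files
shopt -u globstar

echo "top level: ${#top[@]}, recursive: ${#all[@]}"
rm -r "$tmp"
```

The recursive count includes the top-level file because `**/` also matches zero directories.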
+ | |||
+ | ---- | ||
+ | |||
+ | <WRAP info> | ||
+ | **NOTE: | ||
+ | </ | ||
+ | |||
+ | |||
+ | <code bash> | ||
+ | files=(**/) | ||
+ | # Finds all subdirectories. | ||
+ | |||
+ | files=(. **/) | ||
+ | # Finds all subdirectories, | ||
+ | # Corresponds to: find -type d. | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ===== Fail Glob - failglob ===== | ||
+ | |||
+ | If a pattern fails to match, bash reports an expansion error. | ||
+ | |||
+ | This can be useful at the command line: | ||
+ | |||
+ | <code bash> | ||
# Good at the command line! | # Good at the command line! | ||
$ > *.foo # creates file ' | $ > *.foo # creates file ' | ||
Line 427: | Line 505: | ||
$ > *.foo # doesn' | $ > *.foo # doesn' | ||
-bash: no match: *.foo | -bash: no match: *.foo | ||
+ | </ | ||
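In a script, the same behaviour can be observed safely by running the failing command in a child shell. This is a sketch under the assumption that `bash` is on the PATH; the pattern `*.foo` is evaluated in an empty scratch directory so that it cannot match.

```bash
tmp=$(mktemp -d)

# Default behaviour: the unmatched pattern is passed through literally.
plain=$(cd "$tmp" && bash -c 'echo *.foo')

# With failglob: the expansion fails and echo is never run, so nothing is
# printed on stdout (the error message goes to stderr, discarded here).
strict=$(cd "$tmp" && bash -c 'shopt -s failglob
echo *.foo' 2>/dev/null)

echo "without failglob: '$plain', with failglob: '$strict'"
rm -r "$tmp"
```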
- | GLOBIGNORE | + | ---- |
- | The Bash variable (not shopt) | + | ===== GLOBIGNORE |
- | Toggle line numbers | + | The Bash variable (not shopt) GLOBIGNORE allows you to specify patterns a glob should not match. |
+ | This lets you work around the infamous "I want to match all of my dot files, but not . or .." problem: | ||
+ | |||
+ | <code bash> | ||
$ echo .* | $ echo .* | ||
. .. .bash_history .bash_logout .bashrc .inputrc .vimrc | . .. .bash_history .bash_logout .bashrc .inputrc .vimrc | ||
Line 439: | Line 521: | ||
$ echo .* | $ echo .* | ||
.bash_history .bash_logout .bashrc .inputrc .vimrc | .bash_history .bash_logout .bashrc .inputrc .vimrc | ||
+ | </ | ||
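The following sketch reproduces the idea in a scratch directory, inside a subshell so that neither GLOBIGNORE nor the directory change leaks into the calling shell. The file names are assumptions.

```bash
tmp=$(mktemp -d)
touch "$tmp/.alpha" "$tmp/.beta" "$tmp/visible"

# Subshell: GLOBIGNORE and the cd stay local to the demo.
dotfiles=$(
    cd "$tmp" || exit 1
    GLOBIGNORE=.:..
    echo .*
)
echo "$dotfiles"             # prints: .alpha .beta
rm -r "$tmp"
```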
- | Unset GLOBIGNORE | + | ---- |
- | Toggle line numbers | + | ==== Unset GLOBIGNORE ==== |
+ | <code bash> | ||
$ GLOBIGNORE= | $ GLOBIGNORE= | ||
$ echo .* | $ echo .* | ||
. .. .bash_history .bash_logout .bashrc .inputrc .vimrc | . .. .bash_history .bash_logout .bashrc .inputrc .vimrc | ||
+ | </ | ||
- | nocasematch | ||
- | Globs inside [[ and case commands are matched case-insensitive: | + | ---- |
- | Toggle line numbers | + | ===== No Case Match - nocasematch ===== |
+ | Globs inside < | ||
+ | |||
+ | <code bash> | ||
foo() { | foo() { | ||
- | local f r=0 nc=0 | + | |
- | | + | shopt -q nocasematch && nc=1 || shopt -s nocasematch |
- | | + | for f; do |
- | [[ $f = *.@(txt|jpg) ]] || continue | + | [[ $f = *.@(txt|jpg) ]] || continue |
- | cmd -on " | + | cmd -on " |
- | | + | done |
- | | + | ((nc)) || shopt -u nocasematch |
- | | + | return $r |
} | } | ||
+ | </ | ||
This is conventionally done this way: | This is conventionally done this way: | ||
- | Toggle line numbers | + | <code bash> |
case $f in | case $f in | ||
- | | + | |
- | *) continue | + | *) continue |
esac | esac | ||
+ | </ | ||
and in earlier versions of bash we'd use a similar glob: | and in earlier versions of bash we'd use a similar glob: | ||
- | Toggle line numbers | + | <code bash> |
[[ $f = *.@([Tt][Xx][Tt]|[Jj][Pp][Gg]) ]] || continue | [[ $f = *.@([Tt][Xx][Tt]|[Jj][Pp][Gg]) ]] || continue | ||
+ | </ | ||
or with no extglob: | or with no extglob: | ||
- | Toggle line numbers | + | <code bash> |
[[ $f = *.[Tt][Xx][Tt] ]] || [[ $f = *.[Jj][Pp][Gg] ]] || continue | [[ $f = *.[Tt][Xx][Tt] ]] || [[ $f = *.[Jj][Pp][Gg] ]] || continue | ||
+ | </ | ||
- | Here, one might keep the tests separate for maintenance; | + | Here, one might keep the tests separate for maintenance; |
- | + | ||
- | | + | |
Note also: | Note also: | ||
- | Toggle line numbers | + | <code bash> |
[[ $f = *.@([Tt][Xx][Tt]|[Jj][Pp]? | [[ $f = *.@([Tt][Xx][Tt]|[Jj][Pp]? | ||
+ | </ | ||
Variants left as an exercise. | Variants left as an exercise. | ||
- | nocaseglob (since bash 2.02-alpha1) | + | ---- |
+ | |||
+ | ===== No Case Glob - nocaseglob | ||
+ | |||
+ | (since bash 2.02-alpha1) | ||
+ | |||
+ | This option makes pathname expansion case-insensitive. | ||
- | This option makes pathname expansion case-insensitive. | + | In contrast, nocasematch operates on matches in < |
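The contrast between the two options can be shown side by side. This is a minimal sketch with an assumed file name; nullglob is enabled only so that the array is empty rather than holding the literal pattern if nothing matches.

```bash
tmp=$(mktemp -d)
touch "$tmp/README.TXT"

# nocasematch: affects pattern matching in [[ ]] and case, not filenames.
shopt -s nocasematch
if [[ README.TXT == *.txt ]]; then matched=yes; else matched=no; fi
shopt -u nocasematch

# nocaseglob: affects pathname expansion.
shopt -s nocaseglob nullglob
files=("$tmp"/*.txt)
shopt -u nocaseglob nullglob

echo "nocasematch: $matched, nocaseglob found ${#files[@]} file(s)"
rm -r "$tmp"
```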
bash/globs.1575723205.txt.gz · Last modified: 2020/07/15 09:30 (external edit)