BASH - Globs
“Glob” is the common name for a set of Bash features that match or expand specific types of patterns.
Some synonyms for globbing (depending on the context in which it appears) are pattern matching, pattern expansion, filename expansion, and so on.
A glob may look like *.txt and, when used to match filenames, is sometimes called a “wildcard”.
Traditional shell globs use a very simple syntax, which is less expressive than a RegularExpression.
Most characters in a glob are treated literally, but a * matches 0 or more characters, a ? matches precisely one character, and […] matches any single character in a specified set (see Ranges below).
All globs are implicitly anchored at both start and end.
Summary
* | Matches any string, of any length |
foo* | Matches any string beginning with foo |
*x* | Matches any string containing an x (beginning, middle or end) |
*.tar.gz | Matches any string ending with .tar.gz |
*.[ch] | Matches any string ending with .c or .h |
foo? | Matches foot or foo$ but not fools |
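The rows above can be checked without touching the filesystem, because a [[ comparison uses the same glob syntax. The following is a minimal sketch; the helper name glob_match is hypothetical and not part of Bash.

```bash
#!/usr/bin/env bash
# glob_match PATTERN STRING: print "yes" if STRING matches the glob PATTERN, else "no".
# (glob_match is an illustrative helper, not a Bash builtin.)
glob_match() {
  # The right-hand side is left unquoted so that it is treated as a pattern.
  if [[ $2 == $1 ]]; then
    echo yes
  else
    echo no
  fi
}

glob_match 'foo*'     foobar         # yes
glob_match '*x*'      box            # yes
glob_match '*.tar.gz' backup.tar.gz  # yes
glob_match '*.[ch]'   main.h         # yes
glob_match 'foo?'     foot           # yes
glob_match 'foo?'     fools          # no: ? matches exactly one character
glob_match 'foo'      foobar         # no: globs are anchored at both ends
```

The last call shows the implicit anchoring: a pattern with no wildcard only matches the identical string.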
Expansion of Glob - Filenames
Bash expands globs which appear unquoted in commands, by matching filenames relative to the current directory.
Expanding a glob yields one or more words (zero or more if certain options, such as nullglob, are set), and those words (filenames) are then used in the command.
tar xvf *.tar
# Expands to: tar xvf file1.tar file2.tar file42.tar ...
# (which is generally not what one wants)
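To see how many words a glob produces, the expansion can be captured in an array. This is a sketch under assumptions: the scratch directory and the file names are invented for illustration, and nullglob is used as one example of an option that allows zero words.

```bash
#!/usr/bin/env bash
# Work in a scratch directory so the result does not depend on existing files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch file1.tar file2.tar notes.txt

# The glob expands to one word per matching filename.
tars=( *.tar )
echo "${#tars[@]} matches: ${tars[*]}"

# With default options, a glob that matches nothing is left as the literal pattern.
shopt -u nullglob
unmatched=( *.zip )
echo "default:  ${#unmatched[@]} word(s): ${unmatched[*]}"

# With nullglob set, a glob that matches nothing expands to zero words.
shopt -s nullglob
empty=( *.zip )
echo "nullglob: ${#empty[@]} word(s)"
shopt -u nullglob

# Return to the previous directory and clean up.
cd "$OLDPWD" || exit 1
rm -rf "$tmpdir"
```

Capturing the expansion in an array also makes it easy to check for the no-match case before running a command.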
Expansion of Glob - Filename with Whitespace
Even if a file contains internal whitespace, the expansion of a glob that matches that file will still preserve each filename as a single word.
For example,
# This is safe even if a filename contains whitespace:
for f in *.tar; do
    tar tvf "$f"
done

# But this one is not:
for f in $(ls | grep '\.tar$'); do
    tar tvf "$f"
done
In the second example above, the output of ls is filtered, and then the result of the whole pipeline is divided into words, to serve as iterative values for the loop.
This word-splitting will occur at internal whitespace within each filename, which makes it useless in the general case.
The first example has no such problem, because the filenames produced by the glob do not undergo any further word-splitting.
For more such examples, see BashPitfalls.
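The difference can be made visible by counting the words each approach produces. This is a minimal sketch; the scratch directory and the two file names are assumptions chosen for illustration.

```bash
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
touch "$tmpdir/my archive.tar" "$tmpdir/plain.tar"

# Glob: each filename is one word, internal whitespace included.
from_glob=( "$tmpdir"/*.tar )

# Command substitution: the unquoted output is split on whitespace.
# (Deliberately unsafe; shown only for comparison.)
from_ls=( $(ls "$tmpdir" | grep '\.tar$') )

echo "glob gives ${#from_glob[@]} words"   # 2
echo "ls gives ${#from_ls[@]} words"       # 3: my, archive.tar, plain.tar

rm -rf "$tmpdir"
```

Two files become three words in the second case, so a loop over that list would try to open files that do not exist.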
Pattern Matching
Globs are also used to match patterns in a few places in Bash.
The most traditional is in the case command:
case "$input" in
    [Yy]|'') confirm=1;;
    [Nn]*) confirm=0;;
    *) echo "I don't understand. Please try again.";;
esac
Patterns (which are separated by | characters) are matched against the word that follows the case keyword.
The first pattern that matches wins, and its corresponding commands are executed.
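The "first match wins" rule matters when more than one pattern could match the same word. The sketch below assumes a hypothetical classify function and labels, neither of which comes from Bash itself.

```bash
#!/usr/bin/env bash
# classify NAME: print a label for NAME based on the first pattern that matches.
# (classify and its labels are illustrative only.)
classify() {
  case "$1" in
    *.tar.gz|*.tgz) echo "gzipped tarball" ;;
    *.gz)           echo "gzip file" ;;
    *.[ch])         echo "C source or header" ;;
    *)              echo "unknown" ;;
  esac
}

classify backup.tar.gz   # gzipped tarball (also matches *.gz, but the first match wins)
classify log.gz          # gzip file
classify main.c          # C source or header
classify README          # unknown
```

Ordering the branches from most specific to least specific keeps the catch-all * from shadowing the others.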
Comparison Globs
Bash also allows globs to appear on the right-hand side of a comparison inside a [[ command:
if [[ $output = *[Ee]rror* ]]; then ...
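Quoting changes the meaning of the right-hand side: inside [[, a quoted pattern is compared as a literal string. A minimal sketch, using an invented sample value for output:

```bash
#!/usr/bin/env bash
output='fatal error: disk full'   # sample value, assumed for illustration

# Unquoted right-hand side: treated as a glob pattern.
if [[ $output = *[Ee]rror* ]]; then
  unquoted=match
else
  unquoted=nomatch
fi

# Quoted right-hand side: compared literally, so * and [ ] lose their special meaning.
if [[ $output = "*[Ee]rror*" ]]; then
  quoted=match
else
  quoted=nomatch
fi

echo "unquoted: $unquoted, quoted: $quoted"   # unquoted: match, quoted: nomatch
```

The same rule applies when the pattern is stored in a variable: leave the variable unquoted to match it as a glob, and quote it to compare it literally.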
Pattern Stripping
Globs are used during parameter expansion to indicate patterns which may be stripped out, or replaced, during a substitution.
filename=${path##*/}    # strip leading pattern that matches */ (be greedy)
dirname=${path%/*}      # strip trailing pattern matching /* (non-greedy)
printf '%s\n' "${arr[@]}"          # dump an array, one element per line
printf '%s\n' "${arr[@]/error*/}"  # dump array, removing error* if matched
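With concrete values, the greedy and non-greedy forms are easier to tell apart. This sketch assumes a hypothetical path and array contents; the variable names shortest, longest and cleaned are introduced only for illustration.

```bash
#!/usr/bin/env bash
path=/var/log/app/error.log   # hypothetical path

filename=${path##*/}   # longest leading match of */   -> error.log
dirname=${path%/*}     # shortest trailing match of /* -> /var/log/app
shortest=${path#*/}    # shortest leading match of */  -> var/log/app/error.log
longest=${path%%/*}    # longest trailing match of /*  -> empty string

arr=("ok: started" "error: disk full" "ok: done")
# In each element, the part matching error* is replaced with nothing.
cleaned=("${arr[@]/error*/}")

printf '%s\n' "$filename" "$dirname" "$shortest"
printf '[%s]\n' "${cleaned[@]}"
```

Note that the replaced element is still present in the array as an empty string; the substitution edits elements, it does not remove them.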
Ranges
Globs can specify a range or class of characters using square brackets, so that a single position in the pattern matches any one character from a set.
For example:
[abcd] | Matches a or b or c or d |
[a-d] | The same as above, if the globasciiranges shell option is set or your locale is C or POSIX. Otherwise, implementation-defined. |
[!aeiouAEIOU] | Matches any character except a, e, i, o, u and their uppercase counterparts |
[[:alnum:]] | Matches any alphanumeric character in the current locale (letter or number) |
[[:space:]] | Matches any whitespace character |
[![:space:]] | Matches any character that is not whitespace |
[[:digit:]_.] | Matches any digit, or _ or . |
In most shell implementations, one may also use ^ as the range negation character, e.g. [^[:space:]].
However, POSIX specifies ! for this role, and therefore ! is the standard choice.
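The bracket expressions in the table can be tried against single characters with a [[ comparison. A minimal sketch follows; the helper name in_class is hypothetical.

```bash
#!/usr/bin/env bash
# in_class PATTERN CHAR: print "yes" if CHAR matches the glob PATTERN, else "no".
# (in_class is an illustrative helper, not a Bash builtin.)
in_class() {
  if [[ $2 == $1 ]]; then echo yes; else echo no; fi
}

in_class '[abcd]'         c    # yes
in_class '[!aeiouAEIOU]'  e    # no
in_class '[!aeiouAEIOU]'  z    # yes
in_class '[[:digit:]_.]'  _    # yes
in_class '[[:digit:]_.]'  7    # yes
in_class '[![:space:]]'   ' '  # no
in_class '[^[:space:]]'   x    # yes (^ is accepted by Bash; ! is the POSIX form)
in_class '[[:alnum:]]'    -    # no
```

Character classes such as [[:digit:]] avoid the locale-dependent behaviour of ranges like [0-9] or [a-d].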