bash:globs
Differences
This shows you the differences between two versions of the page.
bash:globs [2019/12/07 12:31] – peter | bash:globs [2021/02/04 09:43] (current) – [Glob Star - globstar] peter | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== BASH - Globs ====== | ====== BASH - Globs ====== | ||
- | "Glob" | + | **Glob** or **Globstar** |
Some synonyms for globbing (depending on the context in which it appears) are pattern matching, pattern expansion, filename expansion, and so on. | Some synonyms for globbing (depending on the context in which it appears) are pattern matching, pattern expansion, filename expansion, and so on. | ||
Line 7: | Line 7: | ||
A glob may look like ***.txt** and, when used to match filenames, is sometimes called a " | A glob may look like ***.txt** and, when used to match filenames, is sometimes called a " | ||
- | Traditional shell globs use a very simple syntax, which is less expressive than a RegularExpression. | + | Traditional shell globs use a very simple syntax, which is less expressive than a Regular Expression. |
Most characters in a glob are treated literally, but a * matches 0 or more characters, a ? matches precisely one character, and [...] matches any single character in a specified set (see Ranges below). | Most characters in a glob are treated literally, but a * matches 0 or more characters, a ? matches precisely one character, and [...] matches any single character in a specified set (see Ranges below). | ||
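The three metacharacters described above can be tried out without touching the filesystem, since `[[ ... == pattern ]]` uses the same matching rules. The following is only a sketch; `glob_match` is a hypothetical helper name, not something defined on this page.

```bash
#!/usr/bin/env bash
# glob_match (hypothetical helper): succeeds when string $1 matches glob $2.
glob_match() {
  # The right-hand side is left unquoted so bash treats it as a pattern.
  [[ $1 == $2 ]]
}

glob_match "notes.txt" "*.txt"     && star=yes || star=no   # * matches "notes"
glob_match "a.txt"     "?.txt"     && one=yes  || one=no    # ? matches exactly "a"
glob_match "ab.txt"    "?.txt"     && two=yes  || two=no    # ? cannot match "ab"
glob_match "b.txt"     "[abc].txt" && set_match=yes || set_match=no

echo "$star $one $two $set_match"
```

Quoting the pattern on the right-hand side would turn it into a literal string comparison, which is why `$2` is left unquoted inside the helper.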
Line 35: | Line 35: | ||
tar xvf *.tar | tar xvf *.tar | ||
# Expands to: tar xvf file1.tar file2.tar file42.tar ... | # Expands to: tar xvf file1.tar file2.tar file42.tar ... | ||
- | # (which is generally not what one wants) | ||
</ | </ | ||
Line 49: | Line 48: | ||
# This is safe even if a filename contains whitespace: | # This is safe even if a filename contains whitespace: | ||
for f in *.tar; do | for f in *.tar; do | ||
- | | + | |
done | done | ||
# But this one is not: | # But this one is not: | ||
for f in $(ls | grep ' | for f in $(ls | grep ' | ||
- | | + | |
done | done | ||
</ | </ | ||
- | In the second example above, the output of **ls** is filtered, and then the result of the whole pipeline is divided into words, to serve as iterative values for the loop. | + | <WRAP info> |
+ | **NOTE: | ||
This word-splitting will occur at internal whitespace within each filename, which makes it useless in the general case. | This word-splitting will occur at internal whitespace within each filename, which makes it useless in the general case. | ||
Line 65: | Line 65: | ||
For more such examples, see BashPitfalls. | For more such examples, see BashPitfalls. | ||
+ | |||
+ | </ | ||
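The difference between the two loops can be made visible by counting iterations. This is a rough sketch that works in a throwaway directory; the filenames and the `grep` pattern are assumptions chosen for illustration, since the loop bodies in the listing above are elided.

```bash
#!/usr/bin/env bash
# Work in a temporary directory so nothing else is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch "plain.tar" "with space.tar"

# Glob: every filename stays a single word, whatever it contains.
glob_count=0
for f in *.tar; do
  glob_count=$((glob_count + 1))
done

# Command substitution: the output of ls is split on whitespace,
# so "with space.tar" becomes two separate words.
split_count=0
for f in $(ls | grep '\.tar$'); do
  split_count=$((split_count + 1))
done

echo "glob: $glob_count, ls: $split_count"
cd / && rm -rf "$demo_dir"
```

With two files on disk, the glob loop runs twice while the `ls` loop runs three times.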
---- | ---- | ||
Line 76: | Line 78: | ||
<code bash> | <code bash> | ||
case " | case " | ||
- | | + | |
- | [Nn]*) confirm=0;; | + | [Nn]*) confirm=0;; |
- | *) echo "I don't understand. | + | *) echo "I don't understand. |
esac | esac | ||
</ | </ | ||
- | Patterns (which are separated by | characters) are matched against the first word after the case itself. | + | <WRAP info> |
+ | **NOTE: | ||
The first pattern which matches, " | The first pattern which matches, " | ||
+ | |||
+ | </ | ||
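As a self-contained illustration of glob patterns in `case`, here is a minimal sketch. The function name `confirm_answer` and its return strings are hypothetical; the listing above is truncated, so this is not a reconstruction of it.

```bash
#!/usr/bin/env bash
# confirm_answer (hypothetical): classify a reply using glob patterns.
confirm_answer() {
  case "$1" in
    [Yy]*) echo yes ;;      # anything starting with Y or y
    [Nn]*) echo no ;;       # anything starting with N or n
    *)     echo unknown ;;  # fallback: matches everything else
  esac
}

r1=$(confirm_answer "Yes")
r2=$(confirm_answer "nope")
r3=$(confirm_answer "maybe")
echo "$r1 $r2 $r3"
```

Only the first matching branch runs, so the catch-all `*` must come last.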
---- | ---- | ||
Line 111: | Line 116: | ||
---- | ---- | ||
+ | |||
+ | ===== Ranges ===== | ||
+ | |||
+ | Globs can specify a range or class of characters, using square brackets. | ||
+ | |||
+ | This gives you the ability to match against a set of characters. | ||
+ | |||
+ | For example: | ||
+ | |||
+ | |[abcd]|Matches a or b or c or d| | ||
+ | |[a-d]|The same as above, if globasciiranges is set or your locale is C or POSIX. Otherwise, implementation-defined.| | ||
+ | |[!aeiouAEIOU]|Matches any character except a, e, i, o, u and their uppercase counterparts| | ||
+ | |< | ||
+ | |< | ||
+ | |< | ||
+ | |< | ||
+ | |||
+ | <WRAP info> | ||
+ | **NOTE: | ||
+ | |||
+ | However, POSIX specifies ! for this role, and therefore ! is the standard choice. | ||
+ | |||
+ | Recent Bash versions interpret [a-d] as [abcd]. | ||
+ | |||
+ | </ | ||
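A bracket expression can be exercised with `case`. The sketch below uses a negated set and the POSIX character class `[[:digit:]]`; the class is chosen here for illustration and is an assumption, since several table rows above are truncated. `classify` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# classify (hypothetical): report what kind of character a word starts with.
classify() {
  case "$1" in
    [[:digit:]]*)   echo digit ;;  # POSIX class: any decimal digit
    [!aeiouAEIOU]*) echo other ;;  # negated set: first char is not a vowel
    *)              echo vowel ;;
  esac
}

k1=$(classify "7up")
k2=$(classify "apple")
k3=$(classify "tree")
echo "$k1 $k2 $k3"
```

Character classes such as `[[:digit:]]` do not depend on the locale's collation order, which sidesteps the `[a-d]` caveat noted above.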
+ | |||
+ | ---- | ||
+ | |||
+ | ===== Options which change globbing behavior ===== | ||
+ | |||
+ | ==== Extended Globs - extglob ==== | ||
+ | |||
+ | In addition to the traditional globs (supported by all Bourne-family shells) that we've seen so far, Bash (and Korn Shell) offers extended globs, which have the expressive power of regular expressions. | ||
+ | |||
+ | Korn shell enables these by default; in Bash, you must run the following command in your shell (or at the start of your script -- see note on parsing below) to use them. | ||
+ | |||
+ | <code bash> | ||
+ | shopt -s extglob | ||
+ | </ | ||
+ | |||
+ | The pattern matching reference describes the syntax, which is reproduced here: | ||
+ | |||
+ | <code bash> | ||
+ | ? | ||
+ | Matches zero or one occurrence of the given patterns. | ||
+ | *(pattern-list) | ||
+ | Matches zero or more occurrences of the given patterns. | ||
+ | +(pattern-list) | ||
+ | Matches one or more occurrences of the given patterns. | ||
+ | @(pattern-list) | ||
+ | Matches one of the given patterns. | ||
+ | !(pattern-list) | ||
+ | Matches anything except one of the given patterns. | ||
+ | </ | ||
+ | |||
+ | <WRAP info> | ||
+ | **NOTE: | ||
+ | </ | ||
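Each of the five operators can be tried against plain strings. This is a sketch only; `check` is a hypothetical helper, and the sample strings are made up for illustration.

```bash
#!/usr/bin/env bash
shopt -s extglob

# check (hypothetical): print "match" if string $1 matches extended glob $2.
check() {
  if [[ $1 == $2 ]]; then echo "match"; else echo "no match"; fi
}

a=$(check "color"     "colo?(u)r")   # ?(u): zero or one "u"
b=$(check "colour"    "colo?(u)r")
c=$(check "banana"    "ba+(na)")     # +(na): one or more "na"
d=$(check "ba"        "ba+(na)")     # fails: at least one "na" is required
e=$(check "ba"        "ba*(na)")     # *(na): zero or more "na"
f=$(check "cat"       "@(cat|dog)")  # exactly one of the alternatives
g=$(check "photo.jpg" "!(*.jpg)")    # anything except *.jpg

printf '%s\n' "$a" "$b" "$c" "$d" "$e" "$f" "$g"
```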
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | ==== Example of using Extended Globs ==== | ||
+ | |||
+ | Extended globs allow you to solve a number of problems which otherwise require a rather surprising amount of ugly hacking; for example, | ||
+ | |||
+ | <code bash> | ||
+ | # To remove all the files except ones matching *.jpg: | ||
+ | rm !(*.jpg) | ||
+ | |||
+ | # All except *.jpg and *.gif and *.png: | ||
+ | rm !(*.jpg|*.gif|*.png) | ||
+ | </ | ||
+ | |||
+ | or | ||
+ | |||
+ | <code bash> | ||
+ | # To copy all the MP3 songs except one to your device. | ||
+ | cp !(04*).mp3 /mnt | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ==== Extended Globs with Parameter Expansion ==== | ||
+ | |||
+ | To use an extglob in a parameter expansion (this can also be done in one BASH statement with read): | ||
+ | |||
+ | <code bash> | ||
+ | # To trim leading and trailing whitespace from a variable. | ||
+ | x=${x## | ||
+ | </ | ||
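The listing above is truncated, so the following is not a reconstruction of it. It is one common way to trim whitespace with extglob and parameter expansion, shown as a sketch with a made-up sample value.

```bash
#!/usr/bin/env bash
shopt -s extglob

x=$'  \t padded value \n'

# ## removes the longest leading match of +([[:space:]]) ...
x=${x##+([[:space:]])}
# ... and %% removes the longest trailing match.
x=${x%%+([[:space:]])}

printf '[%s]\n' "$x"
```

Whitespace inside the value is left alone because both expansions are anchored to an end of the string.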
+ | |||
+ | ---- | ||
+ | |||
+ | ==== Nested Extended Glob Patterns ==== | ||
+ | |||
+ | Extended glob patterns can be nested, too. | ||
+ | |||
+ | <code bash> | ||
+ | [[ $fruit = @(ba*(na)|a+(p)le) ]] && echo "Nice fruit" | ||
+ | </ | ||
+ | |||
+ | |||
+ | |||
+ | <WRAP info> | ||
+ | **NOTE: | ||
+ | |||
+ | It is necessary to have a newline (not just a semicolon) between **shopt -s extglob** and any subsequent commands that use it. | ||
+ | |||
+ | You cannot enable extended globs inside a group command that uses them, because the entire block is parsed before the **shopt** is evaluated. | ||
+ | |||
+ | The typical function body is a group command. | ||
+ | </ | ||
+ | |||
+ | Therefore, if you use this option in a script, it is best put right under the shebang line. | ||
+ | |||
+ | <code bash> | ||
+ | # | ||
+ | shopt -s extglob | ||
+ | </ | ||
+ | |||
+ | If your code must be sourced and needs extglob, ensure it preserves the original setting from your shell: | ||
+ | |||
+ | <code bash> | ||
+ | # Remember whether extglob was originally set, so we know whether to unset it. | ||
+ | shopt -q extglob; extglob_set=$? | ||
+ | # Set extglob if it wasn't originally set. | ||
+ | ((extglob_set)) && shopt -s extglob | ||
+ | # Note, 0 (true) from shopt -q is " | ||
+ | |||
+ | # The basic concept behind the following is to delay parsing of the globs until evaluation. | ||
+ | # This matters at group commands, such as functions in { } blocks. | ||
+ | |||
+ | declare -a s='( !(x) )' | ||
+ | echo " | ||
+ | |||
+ | echo " | ||
+ | |||
+ | eval 'echo !(x)' | ||
+ | |||
+ | # Unset extglob if it wasn't originally set. | ||
+ | ((extglob_set)) && shopt -u extglob | ||
+ | |||
+ | # This should also apply for other shell options. | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ===== Null Glob - nullglob ===== | ||
+ | |||
+ | nullglob expands non-matching globs to zero arguments, rather than to themselves. | ||
+ | |||
+ | <code bash> | ||
+ | $ ls *.c | ||
+ | ls: cannot access *.c: No such file or directory | ||
+ | |||
+ | # With nullglob set. | ||
+ | shopt -s nullglob | ||
+ | ls *.c | ||
+ | # Runs " | ||
+ | </ | ||
+ | |||
+ | Typically, nullglob is used to count the number of files matching a pattern: | ||
+ | |||
+ | <code bash> | ||
+ | shopt -s nullglob | ||
+ | files=(*) | ||
+ | echo "There are ${# | ||
+ | </ | ||
+ | |||
+ | Without nullglob, the glob would expand to a literal * in an empty directory, resulting in an erroneous count of 1. | ||
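The erroneous count of 1 is easy to reproduce in an empty directory. The sketch below assumes a temporary directory created with `mktemp -d` and that `failglob` is not set; the variable names are illustrative.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)      # a fresh, empty directory
cd "$demo_dir" || exit 1

shopt -u nullglob
without=(*)                # no match: the pattern is kept as a literal "*"

shopt -s nullglob
with=(*)                   # no match: the pattern expands to nothing

echo "without nullglob: ${#without[@]}, with nullglob: ${#with[@]}"

shopt -u nullglob
cd / && rm -rf "$demo_dir"
```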
+ | |||
+ | ---- | ||
+ | |||
+ | ==== Null Glob BUG ====
+ | |||
+ | <WRAP important> | ||
+ | **WARNING: | ||
+ | |||
+ | It " | ||
+ | |||
+ | Removing array elements: | ||
+ | |||
+ | <code bash> | ||
+ | shopt -s nullglob | ||
+ | unset array[1] | ||
+ | #unsets nothing | ||
+ | |||
+ | unset -v " | ||
+ | #correct | ||
+ | </ | ||
+ | |||
+ | Array member assignments in compound form using subscripts: | ||
+ | |||
+ | <code bash> | ||
+ | shopt -s nullglob | ||
+ | array=([1]=*) | ||
+ | # Results in an empty array. | ||
+ | </ | ||
+ | |||
+ | This was reported as a [[http:// | ||
+ | </ | ||
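The array-element case can be observed directly. This sketch runs in an empty temporary directory, so the unquoted `array[1]` is treated as a glob that matches nothing; the array contents are made up for illustration.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
shopt -s nullglob

array=(a b c)

# Unquoted: array[1] is a glob; with nullglob it vanishes,
# so this effectively runs "unset" with no arguments.
unset array[1]
after_unquoted=${#array[@]}

# Quoted: the subscript reaches unset intact.
unset -v 'array[1]'
after_quoted=${#array[@]}

echo "$after_unquoted $after_quoted"

shopt -u nullglob
cd / && rm -rf "$demo_dir"
```

Without nullglob the unquoted form usually appears to work, but it would silently target a different name if a file called `array1` happened to exist in the current directory.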
+ | |||
+ | Apart from a few builtins that use modified parsing under special conditions (e.g. declare), always use quotes when arguments to simple commands could be interpreted as globs. | ||
+ | |||
+ | Enabling failglob, nullglob, or both during development and testing can help catch mistakes early. | ||
+ | |||
+ | To prevent pathname expansion occurring in unintended places, you can set **failglob**. | ||
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | ==== Null Glob Portability ==== | ||
+ | |||
+ | "null globbing" | ||
+ | |||
+ | In portable scripts, you must explicitly check that a glob match was successful by checking that the files actually exist. | ||
+ | |||
+ | <code bash> | ||
+ | # POSIX | ||
+ | |||
+ | for x in *; do | ||
+ | [ -e " | ||
+ | ... | ||
+ | done | ||
+ | |||
+ | f() { | ||
+ | [ -e " | ||
+ | |||
+ | for x do | ||
+ | ... | ||
+ | done | ||
+ | } | ||
+ | |||
+ | f * || echo "No files found" | ||
+ | </ | ||
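The existence check can be wrapped in a small function that uses only POSIX features. This is a sketch, not the listing above (which is truncated); `count_matches` is a hypothetical name, and the `-L` test is added here so that dangling symlinks still count as matches.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# count_matches (hypothetical): call with an UNQUOTED glob. The shell expands
# it first; if nothing matched, $1 is the literal pattern and does not exist.
count_matches() {
  if [ -e "$1" ] || [ -L "$1" ]; then
    echo "$#"
  else
    echo 0
  fi
}

none=$(count_matches *.log)   # nothing matches yet
touch a.log b.log
some=$(count_matches *.log)   # now two files match

echo "$none $some"
cd / && rm -rf "$demo_dir"
```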
+ | |||
+ | Some modern POSIX-compatible shells allow null globbing as an extension. | ||
+ | |||
+ | <code bash> | ||
+ | # Bash | ||
+ | shopt -s nullglob | ||
+ | </ | ||
+ | |||
+ | In ksh93, there is no toggle-able option. | ||
+ | |||
+ | <code bash> | ||
+ | # ksh93 | ||
+ | |||
+ | for x in ~(N)*; do | ||
+ | ... | ||
+ | done | ||
+ | </ | ||
+ | |||
+ | In zsh, a toggle-able option (NULL_GLOB) or a glob qualifier (N) can be used. | ||
+ | |||
+ | <code bash> | ||
+ | # zsh | ||
+ | for x in *(N); do ...; done # or setopt NULL_GLOB | ||
+ | </ | ||
+ | |||
+ | mksh doesn' | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ===== Dot Glob - dotglob ===== | ||
+ | |||
+ | By convention, a filename beginning with a dot is " | ||
+ | |||
+ | Globbing uses the same convention -- filenames beginning with a dot are not matched by a glob, unless the glob also begins with a dot. | ||
+ | |||
+ | Bash has a dotglob option that lets globs match "dot files": | ||
+ | |||
+ | <code bash> | ||
+ | shopt -s dotglob nullglob | ||
+ | files=(*) | ||
+ | echo "There are ${# | ||
+ | </ | ||
+ | |||
+ | It should be noted that when dotglob is enabled, * will match files like .bashrc but not the . or .. directories. | ||
+ | |||
+ | This is orthogonal to the problem of matching "just the dot files" -- a glob of .* will match . and .., typically causing problems. | ||
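The effect of dotglob on `*` can be checked by counting matches with the option off and on. A minimal sketch, assuming a temporary directory with one hidden and one visible file (both names are made up):

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch .hidden visible

shopt -s nullglob

shopt -u dotglob
plain=(*)        # dot files are skipped

shopt -s dotglob
dotted=(*)       # dot files are included, but never . or ..

echo "${#plain[@]} ${#dotted[@]}"

shopt -u dotglob nullglob
cd / && rm -rf "$demo_dir"
```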
+ | |||
+ | ---- | ||
+ | |||
+ | ===== Glob Star - globstar ===== | ||
+ | |||
+ | (since bash 4.0-alpha) | ||
+ | |||
+ | globstar recursively repeats a pattern containing **< | ||
+ | |||
+ | <code bash> | ||
+ | shopt -s globstar; tree | ||
+ | . | ||
+ | ├── directory2 | ||
+ | │ | ||
+ | │ | ||
+ | │ | ||
+ | ├── file1 | ||
+ | └── file2.c | ||
+ | </ | ||
+ | |||
+ | Assume the directory tree above for the following examples. | ||
+ | |||
+ | Matching files: | ||
+ | |||
+ | <code bash> | ||
+ | $ files=(**) | ||
+ | # equivalent to: files=(* */* */*/*) | ||
+ | # finds all files recursively | ||
+ | |||
+ | $ files=(**/ | ||
+ | # equivalent to: files=(*.c */*.c */*/*.c) | ||
+ | # finds all *.c files recursively | ||
+ | # corresponds to: find -name " | ||
+ | # Caveat: **.c will not work; ** is only special as a whole path component, so **.c behaves like *.c. | ||
+ | </ | ||
+ | |||
+ | <WRAP info> | ||
+ | **NOTE: | ||
+ | |||
+ | <code bash> | ||
+ | shopt -u globstar | ||
+ | </ | ||
+ | |||
+ | See: **help shopt** for details. | ||
+ | </ | ||
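To see what globstar changes, the same pattern can be expanded with the option off and on. This sketch requires bash 4.0 or later and builds its own small tree; the file and directory names are assumptions, not the (truncated) tree shown above.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir -p directory2/sub
touch file1 file2.c directory2/file3.c directory2/sub/file4.c

shopt -s nullglob

shopt -u globstar
shallow=(**/*.c)   # without globstar, ** acts like *: one directory level only

shopt -s globstar
deep=(**/*.c)      # with globstar, **/ matches zero or more directory levels

printf 'globstar off: %s\n' "${shallow[@]}"
printf 'globstar on:  %s\n' "${deep[@]}"

shopt -u globstar nullglob
cd / && rm -rf "$demo_dir"
```

With the option on, `file2.c` in the starting directory is included as well, because `**/` can match zero directories.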
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | Assume you have a folder structure: | ||
+ | |||
+ | < | ||
+ | . | ||
+ | ├── bar | ||
+ | │ | ||
+ | │ | ||
+ | │ | ||
+ | │ | ||
+ | │ | ||
+ | │ | ||
+ | └── fnord.txt | ||
+ | </ | ||
+ | |||
+ | Then **ls** with single star **< | ||
+ | |||
+ | <code bash> | ||
+ | ls *.txt | ||
+ | fnord.txt | ||
+ | </ | ||
+ | |||
+ | The double star operator **< | ||
+ | |||
+ | <code bash> | ||
+ | ls **/*.txt | ||
+ | bar/ | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | <WRAP info> | ||
+ | **NOTE: | ||
+ | </ | ||
+ | |||
+ | |||
+ | <code bash> | ||
+ | files=(**/) | ||
+ | # Finds all subdirectories. | ||
+ | |||
+ | files=(. **/) | ||
+ | # Finds all subdirectories, | ||
+ | # Corresponds to: find -type d. | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ===== Fail Glob - failglob ===== | ||
+ | |||
+ | With failglob set, if a pattern fails to match, bash reports an expansion error and does not run the command. | ||
+ | | ||
+ | This can be useful at the command line: | ||
+ | |||
+ | <code bash> | ||
+ | # Good at the command line! | ||
+ | $ > *.foo # creates file ' | ||
+ | $ shopt -s failglob | ||
+ | $ > *.foo # doesn' | ||
+ | -bash: no match: *.foo | ||
+ | </ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ===== GLOBIGNORE ===== | ||
+ | |||
+ | The Bash variable (not shopt) GLOBIGNORE allows you to specify patterns a glob should not match. | ||
+ | |||
+ | This lets you work around the infamous "I want to match all of my dot files, but not . or .." problem: | ||
+ | |||
+ | <code bash> | ||
+ | $ echo .* | ||
+ | . .. .bash_history .bash_logout .bashrc .inputrc .vimrc | ||
+ | $ GLOBIGNORE=.: | ||
+ | $ echo .* | ||
+ | .bash_history .bash_logout .bashrc .inputrc .vimrc | ||
+ | </ | ||
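The same behaviour can be checked in a script by counting matches. In this sketch the value `.:..` is an assumption (the assignment in the listing above is truncated), and the filenames are made up. It also shows a documented side effect: a non-null GLOBIGNORE enables dotglob, and unsetting it disables dotglob again.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch .alpha .beta visible

GLOBIGNORE=.:..       # assumed value: ignore the . and .. entries
dots=(.*)             # only the real dot files remain
stars=(*)             # dotglob is implicitly on, so dot files match * too

unset GLOBIGNORE      # also switches dotglob back off
plain=(*)

echo "${#dots[@]} ${#stars[@]} ${#plain[@]}"
cd / && rm -rf "$demo_dir"
```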
+ | |||
+ | ---- | ||
+ | |||
+ | ==== Unset GLOBIGNORE ==== | ||
+ | |||
+ | <code bash> | ||
+ | $ GLOBIGNORE= | ||
+ | $ echo .* | ||
+ | . .. .bash_history .bash_logout .bashrc .inputrc .vimrc | ||
+ | </ | ||
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | ===== No Case Match - nocasematch ===== | ||
+ | |||
+ | Globs inside < | ||
+ | |||
+ | <code bash> | ||
+ | foo() { | ||
+ | local f r=0 nc=0 | ||
+ | shopt -q nocasematch && nc=1 || shopt -s nocasematch | ||
+ | for f; do | ||
+ | [[ $f = *.@(txt|jpg) ]] || continue | ||
+ | cmd -on " | ||
+ | done | ||
+ | ((nc)) || shopt -u nocasematch | ||
+ | return $r | ||
+ | } | ||
+ | </ | ||
+ | |||
+ | Conventionally, this is done with a case statement: | ||
+ | |||
+ | <code bash> | ||
+ | case $f in | ||
+ | *.[Tt][Xx][Tt]|*.[Jj][Pp][Gg]) : ;; | ||
+ | *) continue | ||
+ | esac | ||
+ | </ | ||
+ | |||
+ | and in earlier versions of bash we'd use a similar glob: | ||
+ | |||
+ | <code bash> | ||
+ | [[ $f = *.@([Tt][Xx][Tt]|[Jj][Pp][Gg]) ]] || continue | ||
+ | </ | ||
+ | |||
+ | or with no extglob: | ||
+ | |||
+ | <code bash> | ||
+ | [[ $f = *.[Tt][Xx][Tt] ]] || [[ $f = *.[Jj][Pp][Gg] ]] || continue | ||
+ | </ | ||
+ | |||
+ | Here, one might keep the tests separate for maintenance; | ||
+ | |||
+ | Note also: | ||
+ | |||
+ | <code bash> | ||
+ | [[ $f = *.@([Tt][Xx][Tt]|[Jj][Pp]? | ||
+ | </ | ||
+ | |||
+ | Variants left as an exercise. | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ===== No Case Glob - nocaseglob ===== | ||
+ | |||
+ | (since bash 2.02-alpha1) | ||
+ | |||
+ | This option makes pathname expansion case-insensitive. | ||
+ | |||
+ | In contrast, nocasematch operates on matches in < | ||
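The division of labour between the two options can be shown side by side: nocaseglob affects filename expansion, nocasematch affects string matching in `[[ ]]` and `case`. A minimal sketch, assuming a temporary directory and made-up filenames:

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch notes.txt README.TXT
shopt -s nullglob

# nocaseglob: pathname expansion
shopt -u nocaseglob
sensitive=(*.txt)       # only notes.txt
shopt -s nocaseglob
insensitive=(*.txt)     # notes.txt and README.TXT
shopt -u nocaseglob

# nocasematch: pattern matching on strings
shopt -u nocasematch
[[ README.TXT == *.txt ]] && m1=yes || m1=no
shopt -s nocasematch
[[ README.TXT == *.txt ]] && m2=yes || m2=no
shopt -u nocasematch

echo "${#sensitive[@]} ${#insensitive[@]} $m1 $m2"
shopt -u nullglob
cd / && rm -rf "$demo_dir"
```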
bash/globs.1575721912.txt.gz · Last modified: 2020/07/15 09:30 (external edit)