bash:files:read_a_file
Differences
This shows you the differences between two versions of the page.
bash:files:read_a_file [2021/01/26 13:20] – peter | bash:files:read_a_file [2021/01/26 14:19] (current) – [How to keep other commands from "eating" the input] peter | ||
---|---|---|---|
Line 7: | Line 7: | ||
[[BASH: | [[BASH: | ||
- | ---- | + | [[BASH: |
- | ===== Basic read ===== | + | [[BASH: |
- | <code bash> | + | [[BASH: |
- | while read -r line; do | + | |
- | printf ' | + | |
- | done < "$file" | + | |
- | </ | + | |
- | <WRAP info> | + | [[BASH:Files:Read a file:Read from an interactive shell|Read from an interactive shell]] |
- | **NOTE:** This reads each line of the file into the **line** variable. | + | |
- | * **line**: is a variable name, chosen by you. | + | [[BASH:Files:Read a file:Skip Reading Comments|Skip Reading Comments]] |
- | * **-r**: | + | |
- | * Without this option, any unescaped backslashes in the input will be discarded. | + | |
- | * You should almost always use the **-r** option with read. | + | |
- | * **< "$file"**: The file to read. | + | |
- | </ | + | |
- | ---- | + | [[BASH: |
- | ===== Prevent removal of leading and trailing white-space characters ===== | ||
- | |||
- | <code bash> | ||
- | while IFS= read -r line; do | ||
- | printf ' | ||
- | done < " | ||
- | </ | ||
- | |||
- | <WRAP info> | ||
- | **NOTE: | ||
- | |||
- | The **IFS** (internal field separator) is often set to support reads. | ||
- | |||
- | * **IFS= **: By default, read strips all leading and trailing white-space characters (spaces and tabs) from each line it reads, because those characters are present in the default IFS. | ||
- | * To prevent this, IFS is set to the empty string for the duration of the read command. | ||
- | * **line**: | ||
- | * **-r**: | ||
- | * Without this option, any unescaped backslashes in the input will be discarded. | ||
- | * You should almost always use the **-r** option with read. | ||
- | * **< " | ||
- | |||
- | </ | ||
---- | ---- | ||
- | ====== Read fields from a file ====== | ||
- | |||
- | To read fields within each line of the file, pass additional variable names to read. | ||
- | |||
- | For instance, if an input file has 3 columns separated by white space (space or tab characters only): | ||
- | |||
- | <code bash> | ||
- | while read -r first_name last_name phone; do | ||
- | # Only print the last name (second column). | ||
- | printf ' | ||
- | done < " | ||
- | </ | ||
- | |||
- | ---- | ||
- | |||
- | If the field delimiters are not whitespace, set the IFS (internal field separator): | ||
- | |||
- | <code bash> | ||
- | # Extract the username and its shell from / | ||
- | while IFS=: read -r user pass uid gid gecos home shell; do | ||
- | printf '%s: %s\n' " | ||
- | done < /etc/passwd | ||
- | </ | ||
- | |||
- | <WRAP info> | ||
- | **NOTE: | ||
- | |||
- | For tab-delimited files, use **IFS=$' | ||
- | |||
- | You do not necessarily need to know how many fields each line of input contains. | ||
- | |||
- | * If you supply more variables than there are fields, the extra variables will be empty. | ||
- | * If you supply fewer, the last variable gets "all the rest" of the fields after the preceding ones are satisfied. | ||
- | |||
- | For example: | ||
- | |||
- | <code bash> | ||
- | read -r first last junk <<< | ||
- | </ | ||
- | |||
- | * **first**: | ||
- | * **last**: | ||
- | * **junk**: | ||
- | |||
- | The throwaway variable **_** can be used as a "junk variable" | ||
- | |||
- | * It (or indeed any variable) can also be used more than once in a single read command, if we don't care what goes into it: | ||
- | |||
- | <code bash> | ||
- | read -r _ _ first middle last _ <<< | ||
- | </ | ||
- | |||
- | * We skip the first two fields, then read the next three. | ||
- | * The final **_** can absorb any number of fields. | ||
- | * It does not need to be repeated there. | ||
- | |||
- | This usage of **_** is only guaranteed to work in Bash. | ||
- | |||
- | * Many other shells use _ for other purposes, so at best this will not have the desired effect, and at worst it can break the script entirely. | ||
- | * It is better to choose a unique variable name that isn't used elsewhere in the script, even though _ is a common Bash convention. | ||
- | |||
- | </ | ||
- | |||
- | |||
- | ---- | ||
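The field-splitting rules described in this section can be tried out with a small sketch. The helper name `user_shells`, the sample records, and the variable names are made up for illustration; the input is passwd-style text supplied inline rather than the real /etc/passwd. The sketch uses a here document rather than a here string so that it also runs in shells other than Bash.

```shell
# Hypothetical helper: print "user: shell" for passwd-style records on stdin.
# IFS=: applies only to the read command, so fields are split on colons.
user_shells() {
    while IFS=: read -r user _pass _uid _gid _gecos _home shell; do
        printf '%s: %s\n' "$user" "$shell"
    done
}

# Made-up sample records, not taken from a real system.
sample='alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/bin/sh'

result=$(printf '%s\n' "$sample" | user_shells)
printf '%s\n' "$result"

# Fewer variables than fields: the last variable receives the rest of the line.
read -r first rest <<EOF
one two three four
EOF
printf 'first=%s rest=%s\n' "$first" "$rest"
```

Named throwaway variables such as `_pass` are used here instead of a bare `_`, in line with the portability caveat above.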
- | |||
- | ====== Field splitting, white-space trimming, and other input processing ====== | ||
- | |||
- | When not to use the **-r** option: | ||
- | |||
- | * **-r**: | ||
- | * Without this option, any unescaped backslashes in the input will be discarded. | ||
- | * You should almost always use the **-r** option with read. | ||
- | |||
- | The most common exception to this rule is when **-e** is used, which uses Readline to obtain the line from an interactive shell. | ||
- | |||
- | * In that case, tab completion will add backslashes to escape spaces and such, and you do not want them to be literally included in the variable. | ||
- | * This would never be used when reading anything line-by-line, | ||
- | |||
- | |||
- | ---- | ||
- | |||
- | ====== Skip Reading Comments ====== | ||
- | |||
- | To avoid reading comments starting with **#** simply skip them inside the loop: | ||
- | |||
- | <code bash> | ||
- | while read -r line; do | ||
- | [[ $line = \#* ]] && continue | ||
- | printf ' | ||
- | done < " | ||
- | </ | ||
- | |||
- | ---- | ||
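As a minimal sketch of the comment-skipping loop, the following wraps it in a hypothetical function named `strip_comments` and feeds it made-up input. It uses a `case` pattern as a portable stand-in for the `[[ $line = \#* ]]` test, which is an assumption about what is wanted outside Bash. Because read is called without clearing IFS, leading white space is trimmed first, so indented comments are skipped as well.

```shell
# Hypothetical helper: copy stdin to stdout, dropping lines that start with '#'.
strip_comments() {
    while read -r line; do
        case $line in
            '#'*) continue ;;   # comment line: skip it
        esac
        printf '%s\n' "$line"
    done
}

# Made-up sample input with a plain and an indented comment.
filtered=$(printf 'first\n# a comment\n  # indented comment\nsecond\n' | strip_comments)
printf '%s\n' "$filtered"
```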
- | |||
- | ====== Input source selection ====== | ||
- | |||
- | The redirection < " | ||
- | |||
- | If your input source is the contents of a variable/ | ||
- | |||
- | <code bash> | ||
- | while IFS= read -r line; do | ||
- | printf ' | ||
- | done <<< | ||
- | </ | ||
- | |||
- | ---- | ||
- | |||
- | The same can be done in any Bourne-type shell by using a "here document" | ||
- | |||
- | <code bash> | ||
- | while IFS= read -r line; do | ||
- | printf ' | ||
- | done <<EOF | ||
- | $var | ||
- | EOF | ||
- | </ | ||
- | |||
- | ---- | ||
- | |||
- | ====== Read from a command instead of a regular file ====== | ||
- | |||
- | <code bash> | ||
- | some command | while IFS= read -r line; do | ||
- | printf ' | ||
- | done | ||
- | </ | ||
- | |||
- | ---- | ||
- | |||
- | This method is especially useful for processing the output of find with a block of commands: | ||
- | |||
- | <code bash> | ||
- | find . -type f -print0 | while IFS= read -r -d '' | ||
- | mv " | ||
- | done | ||
- | </ | ||
- | |||
- | **NOTE: | ||
- | |||
- | * **-print0**: | ||
- | * **-d '' | ||
- | * By default, **find** delimits its output with newlines and **read** delimits its input with newlines; however, since filenames can contain newlines themselves, this default behavior splits such filenames at the newlines and causes the loop body to fail. | ||
- | * **IFS= **: Set to an empty string, because otherwise read would still strip leading and trailing whitespace. | ||
- | * **|**: | ||
- | * This places the loop in a "sub shell", | ||
- | * To avoid that, you may use a ProcessSubstitution: | ||
- | |||
- | |||
- | |||
- | ---- | ||
- | |||
- | <code bash> | ||
- | while IFS= read -r line; do | ||
- | printf ' | ||
- | done < <(some command) | ||
- | </ | ||
- | |||
- | |||
- | ---- | ||
- | |||
- | ===== My text files are broken! | ||
- | |||
- | If there are some characters after the last line in the file (or to put it differently, | ||
- | |||
- | <code bash> | ||
- | # Emulate cat | ||
- | while IFS= read -r line; do | ||
- | printf ' | ||
- | done < " | ||
- | [[ -n $line ]] && printf %s " | ||
- | </ | ||
- | |||
- | or: | ||
- | |||
- | <code bash> | ||
- | # This does not work: | ||
- | printf 'line 1\ntruncated line 2' | while read -r line; do echo $line; done | ||
- | |||
- | # This does not work either: | ||
- | printf 'line 1\ntruncated line 2' | while read -r line; do echo " | ||
- | |||
- | # This works: | ||
- | printf 'line 1\ntruncated line 2' | { while read -r line; do echo " | ||
- | </ | ||
- | |||
- | The first example, beyond missing the after-loop test, is also missing quotes. See Quotes or Arguments for an explanation of why that matters. The Arguments page is an especially important read. | ||
- | |||
- | For a discussion of why the second example above does not work as expected, see FAQ #24. | ||
- | |||
- | ---- | ||
- | |||
- | Alternatively, | ||
- | |||
- | <code bash> | ||
- | while IFS= read -r line || [[ -n $line ]]; do | ||
- | printf ' | ||
- | done < " | ||
- | |||
- | printf 'line 1\ntruncated line 2' | while read -r line || [[ -n $line ]]; do echo " | ||
- | </ | ||
- | |||
- | ---- | ||
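The effect of the `read ... || test` idiom on a file whose last line lacks a trailing newline can be checked with a short sketch. The function names `count_lines` and `naive_count` are invented for this illustration, and the test is written as `[ -n "$line" ]` instead of `[[ -n $line ]]` only so the sketch also runs in plain POSIX shells.

```shell
# Hypothetical helper: count lines on stdin, including a final unterminated one.
# read returns non-zero at end of input but still fills "line" with any partial
# data, so the extra test keeps the loop going for that last fragment.
count_lines() {
    n=0
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$n"
}

# Same loop without the extra test: the unterminated last line is dropped.
naive_count() {
    n=0
    while IFS= read -r line; do
        n=$((n + 1))
    done
    printf '%s\n' "$n"
}

with_fix=$(printf 'line 1\ntruncated line 2' | count_lines)
without_fix=$(printf 'line 1\ntruncated line 2' | naive_count)
printf 'with test: %s, without test: %s\n' "$with_fix" "$without_fix"
```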
- | |||
- | ===== How to keep other commands from " | ||
- | |||
- | Some commands greedily eat up all available data on standard input. | ||
- | |||
- | <code bash> | ||
- | while read -r line; do | ||
- | cat > ignoredfile | ||
- | printf ' | ||
- | done < " | ||
- | </ | ||
- | |||
- | will only print the contents of the first line, with the remaining contents going to " | ||
- | |||
- | ---- | ||
- | |||
- | One workaround is to use a numeric FileDescriptor rather than standard input: | ||
- | |||
- | <code bash> | ||
- | # Bash | ||
- | while IFS= read -r -u 9 line; do | ||
- | cat > ignoredfile | ||
- | printf ' | ||
- | done 9< " | ||
- | |||
- | # Note that read -u is not portable to every shell. Use a redirect to ensure it works in any POSIX compliant shell: | ||
- | while IFS= read -r line <&9; do | ||
- | cat > ignoredfile | ||
- | printf ' | ||
- | done 9< " | ||
- | </ | ||
- | |||
- | or: | ||
- | <code bash> | ||
- | exec 9< " | ||
- | while IFS= read -r line <&9; do | ||
- | cat > ignoredfile | ||
- | printf ' | ||
- | done | ||
- | exec 9<&- | ||
- | </ | ||
- | This example will wait for the user to type something into the file ignoredfile at each iteration instead of eating up the loop input. | ||
- | You might need this, for example, with mencoder which will accept user input if there is any, but will continue silently if there isn' | ||
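The difference between the two loop styles in this section can be sketched non-interactively. Everything named here (`tmpdir`, the three-line input file, the `eaten` and `kept` variables) is made up for the demonstration, and the second loop's standard input is pointed at /dev/null as a stand-in for the user's terminal so that the sketch does not wait for typing.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/input"

# Loop input on stdin: cat swallows everything after the first line.
eaten=$(
    while IFS= read -r line; do
        cat > /dev/null
        printf '%s\n' "$line"
    done < "$tmpdir/input"
)

# Loop input on file descriptor 9: cat reads stdin (empty here), not the file.
kept=$(
    while IFS= read -r line <&9; do
        cat > /dev/null
        printf '%s\n' "$line"
    done 9< "$tmpdir/input" < /dev/null
)

printf 'stdin loop saw: %s\n' "$eaten"
printf 'fd 9 loop saw:\n%s\n' "$kept"

rm -rf "$tmpdir"
```

The descriptor number 9 follows the examples above; any descriptor not otherwise in use should work the same way.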
bash/files/read_a_file.1611667235.txt.gz · Last modified: 2021/01/26 13:20 by peter