====== AWK - AWK Variables ======
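AWK provides several built-in variables that control how input is read and split, how output is formatted, and what the running program knows about its environment.
Some of them, such as ARGIND, BINMODE, ERRNO, FIELDWIDTHS, FPAT, IGNORECASE, LINT, PROCINFO, RT and TEXTDOMAIN, are GAWK extensions and are not available in every AWK implementation.
Here is a list of some of the built-in variables supported by AWK: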
^Variable^Details^
|ARGC|The number of command line arguments.|
|ARGIND|The index in ARGV of the current file being processed.|
|ARGV|Array of command line arguments. The array is indexed from 0 to ARGC - 1.|
|BINMODE|Specifies use of “binary” mode for all file I/O.|
|CONVFMT|The conversion format for numbers, "%.6g", by default.|
|ENVIRON|Environment Variables.|
|ERRNO|A string describing the error when a getline redirection or a close call fails.|
|FIELDWIDTHS|A whitespace-separated list of field widths. When set, the input is parsed into fields of fixed width, instead of using the value of the FS variable as the field separator.|
|FILENAME|Current filename.|
|FNR|The input record number in the current input file.|
|FPAT|A regular expression describing the contents of the fields in a record. When set, the input is parsed into fields that match the regular expression, instead of using the value of the FS variable as the field separator.|
|FS|The input field separator, a space by default.|
|IGNORECASE|Controls the case-sensitivity of all regular expression and string operations.|
|LINT|Provides dynamic control of the --lint option.|
|NF|The number of fields in the current input record.|
|NR|The total number of input records seen so far.|
|OFMT|The output format for numbers, "%.6g", by default.|
|OFS|The output field separator, a space by default.|
|ORS|The output record separator, by default a newline.|
|PROCINFO|Information about the process.|
|RLENGTH|The length of the string matched by the match function.|
|RSTART|The first position in the string matched by the match function.|
|RS|The input record separator, by default a newline.|
|RT|The record terminator.|
|SUBSEP|The character used to separate multiple subscripts in array elements, by default "\034".|
|TEXTDOMAIN|The text domain of the AWK program. For compatibility with GNU **gettext**. The default value is "messages".|
----
===== ARGC =====
It holds the number of arguments provided at the command line.
awk 'BEGIN {print "Arguments =", ARGC}' One Two Three Four
returns:
Arguments = 5
**NOTE:** The count is one more than the number of arguments passed, because it also includes the name of the calling program (ARGV[0]) as the first argument.
----
===== ARGIND =====
It represents the index in ARGV of the current file being processed.
awk 'FNR == 1 {
print "ARGIND =", ARGIND; print "Filename =", ARGV[ARGIND]
}' junk1 junk2 junk3
returns:
ARGIND = 1
Filename = junk1
ARGIND = 2
Filename = junk2
ARGIND = 3
Filename = junk3
----
===== ARGV =====
It is an array that stores the command-line arguments.
The array's valid index ranges from 0 to ARGC-1.
awk 'BEGIN {
for (i = 0; i < ARGC; ++i) {
printf "ARGV[%d] = %s\n", i, ARGV[i]
}
}' one two three four
returns:
ARGV[0] = awk
ARGV[1] = one
ARGV[2] = two
ARGV[3] = three
ARGV[4] = four
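Because ARGV is an ordinary array, a program can also change it in the BEGIN block before any input is read. The sketch below assumes a hypothetical custom argument, prefix, that is meant as a label rather than a file name. Blanking the element makes AWK skip it, and with no file arguments left AWK reads standard input.

```shell
# "prefix" is a hypothetical custom argument, not a file name.
printf 'alpha\nbeta\n' | awk 'BEGIN {
    tag = ARGV[1]      # read the custom argument
    ARGV[1] = ""       # blank it so awk does not try to open a file named "prefix"
}
{ print tag ": " $0 }' prefix
```

returns:
prefix: alpha
prefix: beta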
----
===== BINMODE =====
It is used to specify binary mode for all file I/O on non-POSIX systems.
Numeric values of 1, 2, or 3 specify that input files, output files, or all files, respectively, should use binary I/O.
String values of **r** or **w** specify that input files or output files, respectively, should use binary I/O.
String values of **rw** or **wr** specify that all files should use binary I/O.
----
===== CONVFMT =====
It represents the format AWK uses when it converts a number to a string, for example during concatenation.
Its default value is %.6g.
awk 'BEGIN { print "Conversion Format =", CONVFMT }'
returns:
Conversion Format = %.6g
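To see CONVFMT take effect, a number has to be converted to a string. The following minimal sketch changes CONVFMT and forces the conversion by concatenating with an empty string; the variable names x, s and n are only illustrative. Values that are integral are converted as integers, whatever CONVFMT says.

```shell
awk 'BEGIN {
    CONVFMT = "%.2f"
    x = 3.14159
    s = x ""          # concatenation forces a number-to-string conversion
    n = 17 ""         # integral values are converted as integers, ignoring CONVFMT
    print s, n
}'
```

returns:
3.14 17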
----
===== ENVIRON =====
It is an associative array of environment variables.
awk 'BEGIN { print ENVIRON["USER"] }'
returns:
peter
**NOTE:** To find names of other environment variables, use the **env** command.
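Since ENVIRON is a regular associative array, the **in** operator can be used to check whether a variable exists before reading it. The sketch below assumes a hypothetical variable named GREETING, set only for this one command.

```shell
# GREETING is a hypothetical variable set only for this one command.
GREETING=hello awk 'BEGIN {
    if ("GREETING" in ENVIRON)
        print "GREETING =", ENVIRON["GREETING"]
    else
        print "GREETING is not set"
}'
```

returns:
GREETING = hello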
----
===== ERRNO =====
It holds a string describing the error when a redirection fails for getline or when a close call fails.
awk 'BEGIN { ret = getline < "junk.txt"; if (ret == -1) print "Error:", ERRNO }'
returns:
Error: No such file or directory
----
===== FIELDWIDTHS =====
When this variable is set to a space-separated list of field widths, GAWK parses the input into fields of fixed width, instead of using the value of the FS variable as the field separator.
----
===== FILENAME =====
It represents the current file name.
awk 'END {print FILENAME}' test.txt
returns:
test.txt
**NOTE:** FILENAME is undefined in the BEGIN block, because no input file has been opened yet.
----
===== FNR =====
It is similar to NR, but relative to the current file.
It is useful when AWK is operating on multiple files.
It represents the number of the current record in the current file.
echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'FNR < 3'
returns:
One Two
One Two Three
**NOTE:** The value of FNR resets with each new input file.
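The difference between NR and FNR only shows up with more than one input file. The following sketch creates two small throwaway files (the names first.txt and second.txt are made up for the illustration) and prints FILENAME, NR and FNR for every record.

```shell
dir=$(mktemp -d)                      # scratch directory for the two sample files
printf 'a\nb\n' > "$dir/first.txt"
printf 'c\nd\n' > "$dir/second.txt"
# NR keeps counting across files, FNR starts again at 1 for each file
(cd "$dir" && awk '{ print FILENAME, NR, FNR }' first.txt second.txt)
rm -r "$dir"
```

returns:
first.txt 1 1
first.txt 2 2
second.txt 3 1
second.txt 4 2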
----
===== FS =====
It represents the (input) field separator and its default value is a single space, which makes AWK split fields on any run of spaces or tabs.
You can also change it with the **-F** command-line option.
awk 'BEGIN {print "FS = " FS}' | cat -vte
returns:
FS =  $
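As a short illustration of both ways to change the separator, the sketch below splits one colon-separated line with **-F** and one comma-separated line by assigning FS in the BEGIN block. The sample input lines are made up.

```shell
# Colon-separated input, separator given with -F
printf 'root:x:0:0\n' | awk -F: '{ print $1, $3 }'
# Comma-separated input, separator assigned in BEGIN, before the first record is read
printf 'a,b,c\n' | awk 'BEGIN { FS = "," } { print $2 }'
```

returns:
root 0
b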
----
===== IGNORECASE =====
When this variable is set to a non-zero value, GAWK performs regular expression matching and string comparisons case-insensitively.
awk 'BEGIN{IGNORECASE = 1} /peter/' test.txt
returns:
10 Peter Terence Roux 45
**NOTE:** IGNORECASE is a GAWK extension. If it does not work with your AWK, try:
awk 'tolower($0) ~ /peter/' test.txt
----
===== LINT =====
It provides dynamic control of the **--lint** option from the GAWK program.
When this variable is set, GAWK prints lint warnings.
When assigned the string value fatal, lint warnings become fatal errors, exactly like **--lint=fatal**.
awk 'BEGIN {LINT = 1; a}'
returns:
awk: cmd. line:1: warning: reference to uninitialized variable `a'
awk: cmd. line:1: warning: statement has no effect
**NOTE:** This works only with GAWK, not with other AWK implementations.
----
===== NF =====
It represents the number of fields in the current record.
For instance, the following example prints only those lines that contain more than two fields.
echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'NF > 2'
returns:
One Two Three
One Two Three Four
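NF can also be used as a field index: since it holds the number of fields, $NF refers to the last field of the record. A minimal sketch with made-up input:

```shell
# Print the field count followed by the last field of each line
printf 'One Two\nOne Two Three Four\n' | awk '{ print NF, $NF }'
```

returns:
2 Two
4 Four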
----
===== NR =====
It represents the number of the current record.
For instance, the following example prints the record if the current record number is less than three.
echo -e "One Two\nOne Two Three\nOne Two Three Four" | awk 'NR < 3'
returns:
One Two
One Two Three
----
===== OFMT =====
It represents the output format for numbers and its default value is %.6g.
awk 'BEGIN {print "OFMT = " OFMT}'
returns:
OFMT = %.6g
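OFMT applies when **print** outputs a number directly. The sketch below sets it to two decimal places; note that integral values are still printed as integers.

```shell
# 3.14159 is formatted with OFMT, 42 is integral and printed unchanged
awk 'BEGIN { OFMT = "%.2f"; print 3.14159, 42 }'
```

returns:
3.14 42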
----
===== OFS =====
It represents the output field separator and its default value is space.
awk 'BEGIN {print "OFS = " OFS}' | cat -vte
returns:
OFS =  $
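OFS is inserted wherever **print** arguments are separated by commas, and also when AWK rebuilds $0 after a field has been assigned. A small sketch with made-up input shows both cases:

```shell
printf 'a b c\n' | awk 'BEGIN { OFS = "-" } {
    print $1, $2      # commas in print insert OFS
    $1 = $1           # assigning to a field rebuilds $0 with OFS
    print
}'
```

returns:
a-b
a-b-c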
----
===== ORS =====
It represents the output record separator and its default value is newline.
awk 'BEGIN {print "ORS = " ORS}' | cat -vte
returns:
ORS = $
$
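Changing ORS is a simple way to join input lines. The sketch below, using made-up input, ends every record with a comma instead of a newline and prints a single newline at the end so the shell prompt starts on a fresh line.

```shell
# Each print ends with ORS (a comma here) instead of a newline
printf 'a\nb\nc\n' | awk 'BEGIN { ORS = "," } { print } END { printf "\n" }'
```

returns:
a,b,c,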
----
===== PROCINFO =====
This is an associative array containing information about the process, such as real and effective UID numbers, process ID number, and so on.
awk 'BEGIN { print PROCINFO["pid"] }'
returns:
30510
----
===== RLENGTH =====
It represents the length of the string matched by the match function.
AWK's match function searches a string for a substring that matches a given regular expression.
awk 'BEGIN { if (match("One Two Three", "re")) { print RLENGTH } }'
returns:
2
----
===== RS =====
It represents the (input) record separator and its default value is newline.
awk 'BEGIN {print "RS = " RS}' | cat -vte
returns:
RS = $
$
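With a different RS, input that is not line-oriented can be split into records. The following sketch treats a semicolon as the record separator; the input string is made up and deliberately has no trailing newline, so the last record is exactly "c".

```shell
# Three records separated by semicolons, numbered with NR
printf 'a;b;c' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }'
```

returns:
1: a
2: b
3: c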
----
===== RSTART =====
It represents the first position in the string matched by the match function.
awk 'BEGIN { if (match("One Two Three", "Thre")) { print RSTART } }'
returns:
9
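RSTART and RLENGTH are typically used together with substr to extract the matched text. A minimal sketch, with a made-up sample string in the variable s, pulls the first run of digits out of a string. When nothing matches, match returns 0, RSTART is 0 and RLENGTH is -1.

```shell
awk 'BEGIN {
    s = "order 12345 shipped"
    if (match(s, /[0-9]+/))
        # substr(s, RSTART, RLENGTH) is exactly the matched text
        print substr(s, RSTART, RLENGTH), RSTART, RLENGTH
}'
```

returns:
12345 7 5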
----
===== SUBSEP =====
It represents the separator character for array subscripts and its default value is \034.
awk 'BEGIN { print "SUBSEP = " SUBSEP }' | cat -vte
returns:
SUBSEP = ^\$
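SUBSEP matters when an array is indexed with several subscripts: AWK joins them into one string key with SUBSEP in between. The sketch below stores one element under the subscripts 1 and 2 and then splits the key back apart; the array names grid and idx are only illustrative.

```shell
awk 'BEGIN {
    grid[1, 2] = "x"                 # stored under the key 1 SUBSEP 2
    for (key in grid) {
        split(key, idx, SUBSEP)      # recover the individual subscripts
        print idx[1], idx[2], grid[key]
    }
}'
```

returns:
1 2 x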
----
===== $0 =====
It represents the entire input record.
awk '{print $0}' test.txt
returns:
10 Peter Terence Roux 45
11 Virginia Genevieve Roux 45
12 Felix Devon Roux 5
13 David Bruce Stevenson 48
14 Bob James Smith 16
48 Adam Winter Ridley 23
----
===== TEXTDOMAIN =====
It represents the text domain of the AWK program.
It is used to find the localized translations for the program's strings.
awk 'BEGIN { print TEXTDOMAIN }'
returns:
messages
**NOTE:** "messages" is the default text domain, used when the program does not set TEXTDOMAIN itself.
----
===== $n =====
It represents the nth field in the current record where the fields are separated by FS.
awk '{print $2 "\t" $5}' test.txt
returns:
Peter 45
Virginia 45
Felix 5
David 48
Bob 16
Adam 23
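The field number after $ does not have to be a literal; any expression works. The sketch below uses one line in the same layout as test.txt, selects a field through a variable (i is an illustrative name) and selects the second-to-last field with $(NF - 1).

```shell
# $i is field 2 here, $(NF - 1) is the field before the last one
printf '10 Peter Terence Roux 45\n' | awk '{ i = 2; print $i, $(NF - 1) }'
```

returns:
Peter Roux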
----