Awk (English) Cheat Sheet

by TME520 (TME520) via cheatography.com/20978/cs/3902/

Usage

awk [-v var=val] 'program' [file1 file2...]
awk [-v var=val] -f progfile [file1 file2...]

Predefined Variable Summary

FS  Input Field Separator, a space by default.
OFS  Output Field Separator, a space by default.
RS  Record Separator, a newline by default.
ORS  Output Record Separator, a newline by default.
NR  The total Number of input Records seen so far.
NF  The Number of Fields in the current input record.
FILENAME  The name of the current input file. If no files are specified on the command line, the value of FILENAME is "-". However, FILENAME is undefined inside the BEGIN block (unless set by getline).
FNR  The number of records read from the current input file; it is reset for each file read.
$0  The whole line (the current input record).
$1, $2...$n  Fields from 1 to n.
ARGC  The number of command line arguments (does not include options to gawk, or the program source).
ARGV  Array of command line arguments. The array is indexed from 0 to ARGC - 1. Dynamically changing the contents of ARGV controls the files used for data.
ARGIND  The index in ARGV of the current file being processed.

Numeric Functions

atan2(y, x)  Returns the arctangent of y/x in radians.
cos(expr)  Returns the cosine of expr, which is in radians.
exp(expr)  The exponential function.
int(expr)  Truncates to integer.
log(expr)  The natural logarithm function.
rand()  Returns a random number N, between 0 and 1, such that 0 <= N < 1.
sin(expr)  Returns the sine of expr, which is in radians.
sqrt(expr)  The square root function.
srand([expr])  Uses expr as a new seed for the random number generator. If no expr is provided, the time of day is used. The return value is the previous seed for the random number generator.

Bit Manipulation Functions

and(v1, v2)  Return the bitwise AND of the values provided by v1 and v2.
compl(val)  Return the bitwise complement of val.
lshift(val, count)  Return the value of val, shifted left by count bits.
or(v1, v2)  Return the bitwise OR of the values provided by v1 and v2.
rshift(val, count)  Return the value of val, shifted right by count bits.
xor(v1, v2)  Return the bitwise XOR of the values provided by v1 and v2.

String Functions

asort(s [, d])  Returns the number of elements in the source array s. The contents of s are sorted using gawk's normal rules for comparing values, and the indexes of the sorted values of s are replaced with sequential integers starting with 1. If the optional destination array d is specified, then s is first duplicated into d, and then d is sorted, leaving the indexes of the source array s unchanged.
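As a minimal sketch of the usage form and the FS, OFS, NR and NF variables summarized above, the following shell snippet feeds a few records to awk. The sample data and the function names sample_data and reformat_records are hypothetical, chosen only for illustration.

```shell
# Hypothetical sample data: three colon-separated records.
sample_data() {
    printf 'alice:30:admin\nbob:25:user\ncarol:41:user\n'
}

# -v assigns FS and OFS before the program starts.
# Assigning $1 to itself forces awk to rebuild $0 using OFS.
reformat_records() {
    sample_data | awk -v FS=':' -v OFS=' | ' '{ $1 = $1; print NR, NF, $0 }'
}

reformat_records
```

Each output line starts with the record number (NR) and the field count (NF), followed by the record rejoined with the new output separator.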

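The numeric functions can be tried from a BEGIN block, which needs no input file. This is a small sketch, not a reference: the function names truncate_demo, pi_demo and roll_die are hypothetical, and the die-roll idiom is one common way to turn rand() into an integer range.

```shell
# int() truncates toward zero; it does not round.
truncate_demo() {
    awk 'BEGIN { print int(3.9), int(-3.9) }'
}

# atan2(0, -1) is a common idiom for obtaining pi.
pi_demo() {
    awk 'BEGIN { printf "%.4f\n", atan2(0, -1) }'
}

# Roll a six-sided die. $1 is a seed passed to srand(); since
# 0 <= rand() < 1, int(rand() * 6) + 1 falls in the range 1..6.
roll_die() {
    awk -v seed="$1" 'BEGIN { srand(seed); print int(rand() * 6) + 1 }'
}

truncate_demo
pi_demo
roll_die 42
```

With an explicit seed the sequence should be repeatable within one awk implementation, but different implementations are not expected to produce the same numbers.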
By TME520 (TME520) Published 23rd April, 2015. Sponsored by Readable.com


cheatography.com/tme520/ Last updated 12th May, 2016. Measure your website readability!
tme520.com Page 1 of 5. https://readable.com

String Functions (cont)

asorti(s [, d])  Returns the number of elements in the source array s. The behavior is the same as that of asort(), except that the array indices are used for sorting, not the array values. When done, the array is indexed numerically, and the values are those of the original indices. The original values are lost; thus provide a second array if you wish to preserve the original.
gensub(r, s, h [, t])  Search the target string t for matches of the regular expression r. If h is a string beginning with g or G, then replace all matches of r with s. Otherwise, h is a number indicating which match of r to replace. If t is not supplied, $0 is used instead. Within the replacement text s, the sequence \n, where n is a digit from 1 to 9, may be used to indicate just the text that matched the n'th parenthesized subexpression. The sequence \0 represents the entire matched text, as does the character &. Unlike sub() and gsub(), the modified string is returned as the result of the function, and the original target string is not changed.
gsub(r, s [, t])  For each substring matching the regular expression r in the string t, substitute the string s, and return the number of substitutions. If t is not supplied, use $0. An & in the replacement text is replaced with the text that was actually matched. Use \& to get a literal & (this must be typed as "\\&").
index(s, t)  Returns the index of the string t in the string s, or 0 if t is not present (this implies that character indices start at one).
length([s])  Returns the length of the string s, or the length of $0 if s is not supplied.
match(s, r [, a])  Returns the position in s where the regular expression r occurs, or 0 if r is not present, and sets the values of RSTART and RLENGTH. Note that the argument order is the same as for the ~ operator: str ~ re. If array a is provided, a is cleared and then elements 1 through n are filled with the portions of s that match the corresponding parenthesized subexpression in r. The 0'th element of a contains the portion of s matched by the entire regular expression r. Subscripts a[n, "start"] and a[n, "length"] provide the starting index in the string and the length, respectively, of each matching substring.
split(s, a [, r])  Splits the string s into the array a on the regular expression r, and returns the number of fields. If r is omitted, FS is used instead. The array a is cleared first. Splitting behaves identically to field splitting.
sprintf(fmt, expr-list)  Formats expr-list according to fmt, and returns the resulting string.
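The handling of & in gsub() replacement text is easy to get wrong, so here is a short sketch of both cases described above. The function names wrap_digits and literal_amp and the sample strings are hypothetical.

```shell
# An unescaped & in the replacement stands for the matched text.
# gsub() modifies $0 in place and returns the number of substitutions.
wrap_digits() {
    printf '%s\n' "$1" | awk '{ n = gsub(/[0-9]+/, "<&>"); print n, $0 }'
}

# "\\&" in the awk source yields a literal ampersand in the result.
literal_amp() {
    printf '%s\n' "$1" | awk '{ gsub(/ and /, " \\& "); print }'
}

wrap_digits 'a1 b22 c'
literal_amp 'salt and pepper'
```

The awk programs are wrapped in single quotes so that the shell passes the backslashes through unchanged.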

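match(), substr(), split() and sprintf() are often combined to pull a token out of a line and take it apart. The sketch below sticks to the two-argument form of match(), which relies only on RSTART and RLENGTH; the function name extract_version and the input string are hypothetical.

```shell
# Find a dotted version number, cut it out with substr(), then
# split it on a regular expression matching a literal dot.
extract_version() {
    printf '%s\n' "$1" | awk '{
        if (match($0, /[0-9]+\.[0-9]+\.[0-9]+/)) {
            ver = substr($0, RSTART, RLENGTH)   # the matched text
            n = split(ver, part, /\./)          # n is the number of pieces
            print sprintf("%s has %d parts, major=%s", ver, n, part[1])
        }
    }'
}

extract_version 'example release 4.1.3 (hypothetical)'
```

Nothing is printed when the pattern is absent, because match() then returns 0.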

String Functions (cont)

strtonum(str)  Examines str, and returns its numeric value. If str begins with a leading 0, strtonum() assumes that str is an octal number. If str begins with a leading 0x or 0X, strtonum() assumes that str is a hexadecimal number.
sub(r, s [, t])  Just like gsub(), but only the first matching substring is replaced.
substr(s, i [, n])  Returns the at most n-character substring of s starting at i. If n is omitted, the rest of s is used.
tolower(str)  Returns a copy of the string str, with all the upper-case characters in str translated to their corresponding lower-case counterparts. Non-alphabetic characters are left unchanged.
toupper(str)  Returns a copy of the string str, with all the lower-case characters in str translated to their corresponding upper-case counterparts. Non-alphabetic characters are left unchanged.

Operators

&& || !  Logical operators: AND, OR, NOT.
< <= == != >= > ~ !~  Comparison operators (~ and !~ test for a regular expression match).

I/O Statements

close(file [, how])  Close file, pipe or co-process. The optional how should only be used when closing one end of a two-way pipe to a co-process. It must be a string value, either "to" or "from".
getline  Set $0 from next input record; set NF, NR, FNR. Returns 1 on success, 0 on EOF and -1 on an error. Upon an error, ERRNO contains a string describing the problem.
getline <file  Set $0 from next record of file; set NF.
getline var  Set var from next input record; set NR, FNR.
getline var <file  Set var from next record of file.
command | getline [var]  Run command piping the output either into $0 or var, as above. If using a pipe or co-process to getline, or from print or printf within a loop, you must use close() to create new instances.
command |& getline [var]  Run command as a co-process piping the output either into $0 or var, as above. Co-processes are a gawk extension.
next  Stop processing the current input record. The next input record is read and processing starts over with the first pattern in the AWK program. If the end of the input data is reached, the END block(s), if any, are executed.
nextfile  Stop processing the current input file. The next input record read comes from the next input file. FILENAME and ARGIND are updated, FNR is reset to 1, and processing starts over with the first pattern in the AWK program. If the end of the input data is reached, the END block(s), if any, are executed.
print  Prints the current record. The output record is terminated with the value of the ORS variable.
print expr-list  Prints expressions. Each expression is separated by the value of the OFS variable. The output record is terminated with the value of the ORS variable.
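Because getline returns a negative value on error, loops that read from a command are usually written with an explicit "> 0" test. The following is a minimal sketch of that pattern together with close(); the function name count_lines_via_pipe and the echo command are hypothetical stand-ins.

```shell
# Read every line produced by a command, then close the pipe so the
# same command string could be run again as a fresh instance.
count_lines_via_pipe() {
    awk 'BEGIN {
        cmd = "echo one; echo two; echo three"
        n = 0
        # "> 0" stops the loop on EOF (0) and on error (negative value).
        while ((cmd | getline line) > 0)
            n++
        close(cmd)
        print n
    }'
}

count_lines_via_pipe
```

A bare "while (cmd | getline line)" would loop forever if the command could not be run, since an error value is non-zero and therefore true.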

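The next statement pairs naturally with the string functions: one rule discards unwanted records, and a later rule transforms the rest. A small sketch, assuming comment lines start with #; the function name capitalize_noncomments is hypothetical.

```shell
# Skip comment records, then upper-case the first character of the others.
capitalize_noncomments() {
    awk '
        /^#/ { next }    # stop processing this record; later rules are not run
        { print toupper(substr($0, 1, 1)) substr($0, 2) }
    '
}

printf '# header\nfoo bar\nbaz\n' | capitalize_noncomments
```

substr($0, 2) omits the length argument, so it returns the rest of the record.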

I/O Statements (cont)

print expr-list >file  Prints expressions on file. Each expression is separated by the value of the OFS variable. The output record is terminated with the value of the ORS variable.
printf fmt, expr-list  Format and print.
printf fmt, expr-list >file  Format and print on file.
system(cmd-line)  Execute the command cmd-line, and return the exit status.
fflush([file])  Flush any buffers associated with the open output file or pipe file. If file is missing, then stdout is flushed. If file is the null string, then all open output files and pipes have their buffers flushed.
print ... >> file  Appends output to the file.
print ... | command  Writes on a pipe.
print ... |& command  Sends data to a co-process.

Time Functions

systime()  Returns the current time of day as the number of seconds since the Epoch (1970-01-01 00:00:00 UTC on POSIX systems).
mktime(datespec)  Turns datespec into a time stamp of the same form as returned by systime(). The datespec is a string of the form YYYY MM DD HH MM SS[ DST].
strftime([format [, timestamp]])  Formats timestamp according to the specification in format. The timestamp should be of the same form as returned by systime(). If timestamp is missing, the current time of day is used. If format is missing, a default format equivalent to the output of date(1) is used.

GNU AWK's Command Line Argument Summary

-F fs or --field-separator fs  Use fs for the input field separator (the value of the FS predefined variable).
-v var=val or --assign var=val  Assign the value val to the variable var, before execution of the program begins. Such variable values are available to the BEGIN block of an AWK program.
-f program-file or --file program-file  Read the AWK program source from the file program-file, instead of from the first command line argument. Multiple -f (or --file) options may be used.
-mf NNN or -mr NNN  Set various memory limits to the value NNN. The f flag sets the maximum number of fields, and the r flag sets the maximum record size (ignored by gawk, since gawk has no pre-defined limits).
-W compat or -W traditional or --compat or --traditional  Run in compatibility mode. In compatibility mode, gawk behaves identically to UNIX awk; none of the GNU-specific extensions are recognized.
-W dump-variables[=file] or --dump-variables[=file]  Print a sorted list of global variables, their types and final values to file. If no file is provided, gawk uses a file named awkvars.out in the current directory.
-W help or -W usage or --help or --usage  Print a relatively short summary of the available options on the standard output.
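The short options -F, -v and -f can be combined in one invocation. The sketch below writes a two-line program to a temporary file and runs it with -f. The function name sum_column, its two parameters and the awk variable col are hypothetical; the long option spellings listed above are gawk-specific and are not used here.

```shell
# Sum one column of delimited input.
# $1: field separator, $2: 1-based column number.
sum_column() {
    progfile=$(mktemp)
    # The program is stored in a file and loaded with -f.
    cat > "$progfile" <<'EOF'
{ total += $col }
END { printf "%d\n", total }
EOF
    awk -F "$1" -v col="$2" -f "$progfile"
    status=$?
    rm -f "$progfile"
    return $status
}

printf 'a,1\nb,2\nc,3\n' | sum_column , 2
```

Passing the column number with -v keeps the program file generic, since $col is resolved at run time.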

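The printf statement and the "| command" redirection from the I/O statements can be combined to post-process formatted output. This is a sketch under the assumption that a POSIX sort utility is available; the function name sorted_report and the awk variable sorter are hypothetical.

```shell
# Format each record with printf and send it through sort,
# ordering the report numerically by its second column.
sorted_report() {
    awk '
        BEGIN { sorter = "sort -k2,2n" }
        { printf "%-5s %3d\n", $1, $2 | sorter }
        END { close(sorter) }    # close the pipe so sort can emit its output
    '
}

printf 'bob 25\nalice 30\ncarol 7\n' | sorted_report
```

Keeping the command string in one variable ensures that printf and close() refer to the same pipe.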