# 05-D.7.5: Handling Text Files - awk Command

## The awk Command

awk is a scripting language used for manipulating data and generating reports. It requires no compilation and lets the user work with variables, numeric functions, string functions, and logical operators.
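
As a minimal sketch of those building blocks, the example below combines a user-defined variable, a numeric function (`int`), string functions (`toupper`, `substr`), and a logical operator (`&&`) in one program. The file `scores.txt`, its contents, and the 50-point pass mark are hypothetical and exist only for illustration.

```shell
# Hypothetical sample data: name, first score, second score
workdir=$(mktemp -d)
cat > "$workdir/scores.txt" <<'EOF'
alice 82 91
bob 45 67
carol 73 58
EOF

report=$(awk '{
    total = $2 + $3                                  # user-defined variable
    avg = int(total / 2)                             # numeric function: int() truncates
    name = toupper(substr($1, 1, 1)) substr($1, 2)   # string functions: capitalize the name
    if ($2 >= 50 && $3 >= 50)                        # logical AND of two comparisons
        print name, avg, "pass"
    else
        print name, avg, "fail"
}' "$workdir/scores.txt")

printf '%s\n' "$report"
rm -rf "$workdir"
```

No compile step is involved: the program text is passed to awk as a quoted argument and interpreted directly.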

awk lets a programmer write small but effective programs as a series of statements. Each statement defines a text pattern to search for in each line of a file and the action to take when a line matches. awk is mostly used for pattern scanning and processing: it searches one or more files for lines that match the specified patterns and then performs the associated actions.
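
The pattern/action model described above can be sketched with a small program that holds several statements at once. The file `log.txt` and its contents are hypothetical. The sketch also uses `END`, a special pattern whose action runs after the last input line, and shows that a pattern with no action prints the matching line by default.

```shell
# Hypothetical sample data: a severity word followed by a message
workdir=$(mktemp -d)
cat > "$workdir/log.txt" <<'EOF'
INFO service started
ERROR disk full
INFO request served
WARN cache miss
ERROR timeout
EOF

# Every statement is tested against every input line, in order.
summary=$(awk '
    /ERROR/       { errors++ }                  # runs only on lines matching the regex
    $1 == "WARN"  { print "warning:", $2, $3 }  # runs only when field 1 is WARN
    END           { print "errors:", errors }   # runs once, after all input is read
' "$workdir/log.txt")
printf '%s\n' "$summary"

# A pattern without an action prints the whole matching line.
errors_only=$(awk '/ERROR/' "$workdir/log.txt")
printf '%s\n' "$errors_only"
rm -rf "$workdir"
```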

Syntax:

`awk options 'selection_criteria {action}' input-file > output-file`

| Options | Meaning |
| --- | --- |
| `awk '/regular expression/ {print}' filename` | awk prints every line that contains a match for the specified regular expression |
| `awk 'relational expression {print}' filename` | awk lets users test individual fields with `$1`, `$2`, etc. All the typical relational operators are available: `==`, `!=`, `>=`, `<=`, `<`, `>` |
| `awk 'pattern1 && pattern2 {print}' filename` | `&&` is an AND, so BOTH patterns have to be true for awk to output the line |
| `awk 'pattern1 \|\| pattern2 {print}' filename` | `\|\|` is an OR, so at least ONE pattern has to be true for awk to output the line |
| `awk '{print pattern1 ? option1 : option2}' filename` | if pattern1 is TRUE, awk prints option1; otherwise it prints option2 |
| `awk 'pattern1, pattern2 {print}' filename` | a range: awk prints every line from the first line where pattern1 is true through the next line where pattern2 is true, inclusive |

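
The forms listed above leave the `options` and `> output-file` parts of the syntax line unexercised. As a minimal sketch, the example below uses two standard awk options, `-F` (set the field separator) and `-v` (assign a variable before the program runs), and redirects the result to a file. The file `capitals.csv`, the output name `out.txt`, and the variable name `wanted` are hypothetical.

```shell
# Hypothetical comma-separated data: state, capital, abbreviation
workdir=$(mktemp -d)
cat > "$workdir/capitals.csv" <<'EOF'
Montana,Helena,MT
South Dakota,Pierre,SD
Wyoming,Cheyenne,WY
EOF

# -F,          split fields on commas instead of whitespace
# -v wanted=SD define the awk variable "wanted" from the command line
awk -F, -v wanted="SD" '$3 == wanted {print $1 ": " $2}' \
    "$workdir/capitals.csv" > "$workdir/out.txt"

result=$(cat "$workdir/out.txt")
printf '%s\n' "$result"
rm -rf "$workdir"
```

With a comma as the separator, a two-word name such as South Dakota stays in a single field, so the abbreviation is always `$3`.
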

    # if the line contains a 'y' or a 'z' - then print that line
    pbmac@pbmac-server $ awk '/[yz]/ {print}' state.txt
    Arizona
    Wyoming

    # if $2 - that is, the second field (space is the separator) - is Dakota, then print that line
    pbmac@pbmac-server $ awk '$2 == "Dakota" {print}' state.txt
    South Dakota

    # if the first field is South, and the second field is Dakota, then print that line
    pbmac@pbmac-server $ awk '$1 == "South" && $2 == "Dakota" {print}' state.txt
    South Dakota

    # if the first field is South OR Wyoming - then print that line
    pbmac@pbmac-server $ awk '$1 == "South" || $1 == "Wyoming" {print}' state.txt
    South Dakota
    Wyoming

    # if the first field is South, then print "Abbreviation is $4" - where $4 is the fourth field on that line ELSE print
    # "Capital is $2" - where $2 is the second field on that line.
    pbmac@pbmac-server $ awk '{print ($1 == "South") ? "Abbreviation is " $4 : "Capital is " $2}' stcap.txt
    Capital is Helena
    Capital is Phoenix
    Abbreviation is SD
    Capital is Jefferson
    Capital is Juneau
    Capital is Cheyenne
    Capital is Carson

    # if the first field is Montana - then print all lines until the third field of the input is Pierre
    pbmac@pbmac-server $ awk '$1 == "Montana", $3 == "Pierre" {print}' stcap.txt
    Montana Helena MT
    Arizona Phoenix AZ
    South Dakota Pierre SD
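
The range example above matches only one stretch of the file. As a minimal sketch of how a range behaves when its start pattern matches more than once, the example below uses a hypothetical file `notes.txt` with hypothetical `BEGIN_BLOCK` and `END_BLOCK` marker lines. It prints the built-in record number `NR` next to each line to show that both ends of each range are included and that the range reopens at the next start line.

```shell
# Hypothetical sample data with two marked blocks
workdir=$(mktemp -d)
cat > "$workdir/notes.txt" <<'EOF'
intro
BEGIN_BLOCK
first
END_BLOCK
between
BEGIN_BLOCK
second
END_BLOCK
outro
EOF

# The range opens on a line matching the first pattern and closes on the
# next line matching the second; lines outside any range are skipped.
blocks=$(awk '/^BEGIN_BLOCK$/, /^END_BLOCK$/ {print NR ": " $0}' "$workdir/notes.txt")
printf '%s\n' "$blocks"
rm -rf "$workdir"
```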