Skip to main content
Engineering LibreTexts

05-D.7.5: Handling Text Files - awk Command

  • Page ID
    32524
  • \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

    \( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

    ( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

    \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

    \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

    \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

    \( \newcommand{\Span}{\mathrm{span}}\)

    \( \newcommand{\id}{\mathrm{id}}\)

    \( \newcommand{\Span}{\mathrm{span}}\)

    \( \newcommand{\kernel}{\mathrm{null}\,}\)

    \( \newcommand{\range}{\mathrm{range}\,}\)

    \( \newcommand{\RealPart}{\mathrm{Re}}\)

    \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

    \( \newcommand{\Argument}{\mathrm{Arg}}\)

    \( \newcommand{\norm}[1]{\| #1 \|}\)

    \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

    \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

    \( \newcommand{\vectorA}[1]{\vec{#1}}      % arrow\)

    \( \newcommand{\vectorAt}[1]{\vec{\text{#1}}}      % arrow\)

    \( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vectorC}[1]{\textbf{#1}} \)

    \( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

    \( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

    \( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

    \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

    \(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)

    The awk Command

    The awk command programming language is a scripting language used for manipulating data and generating reports. It requires no compiling, and allows the user to use variables, numeric functions, string functions, and logical operators.

    awk is a utility that enables a programmer to write tiny but effective programs in the form of statements that define text patterns that are to be searched for in each line of a file, and the action that is to be taken when a match is found within a line. awk is mostly used for pattern scanning and processing. It searches one or more files to see if they contain lines that match with the specified patterns and then performs the associated actions.

    Syntax:

    awk options 'selection _criteria {action }' input-file > output-file
    Options Meaning
    awk '/regular expression/ {print}' filename awk will print any line that contains the specified regular expression
    awk 'relational expression {print}' filename awk allows users to look at individual fields with $1, $2 etc. Users can specify all the typical relational operators: ==, !=, >=, <=, <, >
    awk 'pattern1 && pattern2 {print}' filename the && is an AND - so BOTH patterns have to be true for awk to output the line
    awk 'pattern1 || pattern2 {print}' filename the || is an OR - so ONE pattern has to be true for awk to output the line
    awk '{print pattern1 ? option1 : option2 }' filename if pattern1 is TRUE, then awk prints option1, else it prints option2
    awk 'pattern1, pattern2 {print}' filename if pattern1 is true, the awk prints the line until pattern2 is true
    # if the line contains a 'y' or a 'z' - then print that line
    pbmac@pbmac-server $ awk '/[yz]/ {print}' state.txt 
    Arizona
    Wyoming
    
    # if $2 - that is the second field (space is the separator)- is Dakota, then print that line
    pbmac@pbmac-server $ awk '$2 == "Dakota" {print}' state.txt 
    South Dakota
    
    # if the firsts field is South, and the second field is Dakota, then print that line
    pbmac@pbmac-server $ awk '$1 == "South" && $2 == "Dakota" {print}' state.txt 
    South Dakota
    
    # if the fisrt field is South OR Wyoming - then print that line
    pbmac@pbmac-server $ awk '$1 == "South" || $1 == "Wyoming" {print}' state.txt 
    South Dakota
    Wyoming
    
    # if the first field is South, then print "Abbreviation is $4" - where $4 is the fourth field on that line ELSE print 
    # "Capital is $2" - where $2 is the second field on that line.
    pbmac@pbmac-server $ awk '{print ($1 == "South") ? "Abbrevition is " $4 : "Capital is " $2}' stcap.txt 
    Capital is Helena
    Capital is Phoenix
    Abbrevition is SD
    Capital is Jefferson
    Capital is Juneau
    Capital is Cheyenne
    Capital is Carson
    
    # if the first field is Montana - then print all lines until the third field of the input is Pierre
    pbmac@pbmac-server $ awk '$1 == "Montana", $3 == "Pierre" {print}' stcap.txt 
    Montana Helena MT
    Arizona Phoenix AZ
    South Dakota Pierre SD
    

    Adapted from:
    "gawk command in Linux with Examples" by Mandeep Sheoran, Geeks for Geeks is licensed under BY-SA 4.0


    05-D.7.5: Handling Text Files - awk Command is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by LibreTexts.

    • Was this article helpful?