The sort Command
The sort command sorts the lines of a file, arranging the records in a particular order. By default, it compares lines as text, character by character (plain ASCII order in the C locale, otherwise the collating rules of the current locale). With options, it can also sort numerically.
Some features of the command are as follows:
- The sort command sorts the contents of a text file, line by line.
- sort is a standard command line program that prints the lines of its input, or the concatenation of all files listed in its argument list, in sorted order.
- It supports sorting alphabetically, in reverse order, by number, and by month, and it can also remove duplicates.
- The sort command can also sort by items not at the beginning of the line, ignore case, and report whether a file is already sorted. Sorting is done based on one or more sort keys extracted from each line of input.
- By default, the entire input line is taken as the sort key. Blank space is the default field separator.
sort [ OPTION ] filename
|-b, --ignore-leading-blanks||ignore leading blanks|
|-d, --dictionary-order||consider only blanks and alphanumeric characters|
|-f, --ignore-case||fold lower case to upper case characters|
|-g, --general-numeric-sort||compare according to general numerical value|
|-i, --ignore-nonprinting||consider only printable characters|
|-M, --month-sort||compare (unknown) < 'JAN' < ... < 'DEC'|
|-h, --human-numeric-sort||compare human readable numbers (e.g., 2K 1G)|
|-n, --numeric-sort||compare according to string numerical value|
|-R, --random-sort||shuffle, but group identical keys. See shuf(1)|
|--random-source=FILE||get random bytes from FILE|
|-r, --reverse||reverse the result of comparisons|
|--sort=WORD||sort according to WORD: general-numeric -g, human-numeric -h, month -M, numeric -n, random -R, version -V|
|-V, --version-sort||natural sort of (version) numbers within text|
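The effect of a few of these options is easiest to see on a small file of numbers. The following is a minimal sketch, not a definitive recipe: the file name numbers and the temporary directory are made up for illustration, and -u (drop duplicate lines) and -c (check whether the input is sorted) are standard sort options that are not listed in the table above.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
printf '%s\n' 10 9 100 9 2 > "$workdir/numbers"

# Default text comparison (forced to plain ASCII order with LC_ALL=C):
# "10" and "100" sort before "2" because "1" comes before "2".
lexical=$(LC_ALL=C sort "$workdir/numbers" | paste -s -d ' ' -)

# -n compares the lines as numbers instead.
numeric=$(sort -n "$workdir/numbers" | paste -s -d ' ' -)

# -n and -r combined: numeric order, largest value first.
descending=$(sort -nr "$workdir/numbers" | paste -s -d ' ' -)

# -u keeps only one line out of each group of equal lines.
unique=$(sort -nu "$workdir/numbers" | paste -s -d ' ' -)

echo "default: $lexical"      # default: 10 100 2 9 9
echo "-n:      $numeric"      # -n:      2 9 9 10 100
echo "-nr:     $descending"   # -nr:     100 10 9 9 2
echo "-nu:     $unique"       # -nu:     2 9 10 100

# -c only checks the order: the exit status is non-zero if the input is unsorted.
if sort -n -c "$workdir/numbers" 2>/dev/null; then
  sorted=yes
else
  sorted=no
fi
echo "already sorted numerically? $sorted"

rm -r "$workdir"
```

The paste command is used here only to print each result on one line; the sorting itself is done entirely by sort.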
The sort command is another command that has an abundance of options, and only a few are shown in the table above. The example below is a very straightforward sort. The cat command lists the state names in no particular order. Then the sort command outputs the same states sorted in alphabetical order. NOTE: The original file, states, is not altered at all. The new list is simply output to the terminal.
pbmac@pbmac-server $ cat states
California
New York
Florida
Texas
North Carolina
Alabama
South Dakota
Washington
Georgia
Ohio
pbmac@pbmac-server $ sort states
Alabama
California
Florida
Georgia
New York
North Carolina
Ohio
South Dakota
Texas
Washington
With its many options, sort can order lines by alphabetic or numeric value, or in reverse. For columnar data it can sort by any one of the columns, with any character specified as the column delimiter. This makes the command useful in a wide range of text-processing tasks.
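Sorting by column can be sketched as follows. This is an illustrative example under stated assumptions: the file capitals and its contents are made up for the demonstration (the third column is an invented score, not real data), and -t (field delimiter) and -k (key field) are standard sort options that are not listed in the table above.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)

# Fields are state:capital:score. The score column is invented for this example.
cat > "$workdir/capitals" <<'EOF'
California:Sacramento:7
Texas:Austin:12
Florida:Tallahassee:3
Ohio:Columbus:25
Georgia:Atlanta:100
EOF

# -t sets the column delimiter; -k 2,2 limits the sort key to column 2 only.
by_capital=$(LC_ALL=C sort -t ':' -k 2,2 "$workdir/capitals")

# An n after the key makes that column compare numerically.
by_score=$(sort -t ':' -k 3,3n "$workdir/capitals")

# Adding r reverses the order of that key, so the largest score comes first.
by_score_desc=$(sort -t ':' -k 3,3nr "$workdir/capitals")

echo "Sorted by capital:"
printf '%s\n' "$by_capital"
echo "Sorted by score, ascending:"
printf '%s\n' "$by_score"
echo "Sorted by score, descending:"
printf '%s\n' "$by_score_desc"

rm -r "$workdir"
```

Writing the key as 2,2 rather than just 2 matters: a bare -k 2 would use everything from column 2 to the end of the line as the key.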
The diff Command
The diff command displays the differences between two files by comparing them line by line. Unlike the related commands cmp and comm, it tells us which lines in one file have to be changed to make the two files identical.
The important thing to remember is that diff describes the differences using special symbols and instructions. These instructions tell you how to change the first file to make it match the second file.
Special symbols are:
- a : add
- c : change
- d : delete
diff [ OPTIONS ] File1 File2
|-b||Ignore changes in the amount of white space.|
|-c||Display a list of differences with three lines of context.|
|-i||Ignore case differences.|
|-t||Expand tab characters in output lines.|
|-u||Output results in unified mode, which presents a more streamlined format.|
|-w||Ignore all white space, including tabs.|
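Two of these options, -u and -i, can be tried out with the short sketch below. It is only an illustration: it builds its own copies of two lists of states in a temporary directory (the same lists used in the example that follows), and the file name states.lower is made up for the demonstration.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
printf '%s\n' 'New York' Florida Texas Alabama 'South Dakota' Washington \
  > "$workdir/states.1"
printf '%s\n' California 'New York' Florida Texas 'North Carolina' Alabama Washington Ohio \
  > "$workdir/states.2"

# diff exits with status 1 when the files differ, so capture the status
# rather than letting it stop a script that runs under "set -e".
status=0
unified=$(diff -u "$workdir/states.1" "$workdir/states.2") || status=$?
printf '%s\n' "$unified"

# In unified output, lines starting with "+" exist only in the second file
# and lines starting with "-" only in the first. The patterns below skip
# the "---" and "+++" header lines.
added=$(printf '%s\n' "$unified" | grep -c '^+[^+]')
removed=$(printf '%s\n' "$unified" | grep -c '^-[^-]')
echo "added: $added, removed: $removed, exit status: $status"

# -i: a lower-cased copy differs from the original only by case,
# so diff -i reports no differences and exits with status 0.
tr '[:upper:]' '[:lower:]' < "$workdir/states.1" > "$workdir/states.lower"
case_status=0
diff -i "$workdir/states.1" "$workdir/states.lower" || case_status=$?
echo "exit status with -i: $case_status"

rm -r "$workdir"
```

The header lines of the unified output contain file names and timestamps, so they will look different on every system; the counts of added and removed lines do not.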
Let's say we have two files named states.1 and states.2, each containing a short list of American states.
pbmac@pbmac-server $ cat states.1
New York
Florida
Texas
Alabama
South Dakota
Washington
pbmac@pbmac-server $ cat states.2
California
New York
Florida
Texas
North Carolina
Alabama
Washington
Ohio
Now, applying the diff command without any options, we get the following output:
pbmac@pbmac-server $ diff states.1 states.2
0a1
> California
3a5
> North Carolina
5d6
< South Dakota
6a8
> Ohio
NOTE: Neither file is altered; the differences are only written to the terminal.
Let’s take a look at what this output means. The first line of each change in the diff output contains:
- Line numbers corresponding to the first file
- A special symbol
- Line numbers corresponding to the second file.
In our case, 0a1 means that after line 0 of the first file (that is, at the very beginning) you have to add California, which is line 1 of the second file. The output then shows the affected lines from each file, each preceded by a symbol:
- Lines preceded by a < are lines from the first file.
- Lines preceded by > are lines from the second file.
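The next change, 3a5, means that after line 3 of the first file we need to add line 5 of the second file (North Carolina). Then 5d6 means that line 5 of the first file (South Dakota) must be deleted; the 6 is the line of the second file after which the deleted line would have appeared, and no line of the second file is removed. Finally, 6a8 means that after line 6 of the first file we add line 8 of the second file (Ohio).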
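To check this reading of the four change commands, they can be applied by hand and the result compared with the second file. The sketch below is an illustration only: it recreates the two files in a temporary directory, the awk program is a hand translation of this particular diff output (not something diff produces), and the file name states.edited is made up for the demonstration.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
printf '%s\n' 'New York' Florida Texas Alabama 'South Dakota' Washington \
  > "$workdir/states.1"
printf '%s\n' California 'New York' Florida Texas 'North Carolina' Alabama Washington Ohio \
  > "$workdir/states.2"

# Each rule is keyed on the line number in the first file (NR):
#   0a1 -> print "California" before line 1
#   3a5 -> print "North Carolina" after line 3
#   5d6 -> do not print line 5 (South Dakota)
#   6a8 -> print "Ohio" after line 6
awk '
  NR == 1 { print "California" }
  NR != 5 { print }
  NR == 3 { print "North Carolina" }
  NR == 6 { print "Ohio" }
' "$workdir/states.1" > "$workdir/states.edited"

# diff prints nothing and exits with status 0 when two files are identical.
status=0
remaining=$(diff "$workdir/states.edited" "$workdir/states.2") || status=$?
echo "exit status: $status"   # exit status: 0

rm -r "$workdir"
```

An exit status of 0 with no output confirms that following the instructions turns the first file into an exact copy of the second.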
"SORT command in Linux/Unix with examples" by Mohak Agrawal, Geeks for Geeks is licensed under CC BY-SA 4.0
"diff command in Linux with examples" by AKASH GUPTA 6, Geeks for Geeks is licensed under CC BY-SA 4.0