The AWK command is a powerful text-processing and pattern-scanning tool in Linux used for manipulating data and generating formatted reports. It reads files line by line, applies patterns, and performs specified actions on matching lines.
Example:
Consider the following text file as the input file for all cases below:
Command:
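$ cat employee.txt
Output:
ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000
amit manager account 47000
tarun peon sales 15000
deepak clerk sales 23000
sunil peon sales 13000
satvik director purchase 80000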
Print All Lines (Default Behavior)
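By default, awk prints every line of the specified file.
Command:
$ awk '{print}' employee.txt
Output:
ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000
amit manager account 47000
tarun peon sales 15000
deepak clerk sales 23000
sunil peon sales 13000
satvik director purchase 80000
In the above example, no pattern is given, so the action applies to every line. The print action without any argument prints the whole line, so the command prints every line of the file.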
Basic AWK Syntax:
awk [options] 'pattern {action}' input-file > output-file
Common awk Command Options
Here are the most commonly used options of the awk command in Linux:
| Option | Description |
|---|---|
| -F | Sets a custom field separator |
| -f | Reads the awk program from a file |
| -v | Assigns a variable before execution |
| --help | Displays the help information (GNU awk) |
| --version | Displays the version of awk (GNU awk) |
AWK Command Options with Examples
1. Using -F (Field Separator Option)
- The -F option sets a custom field separator.
- By default, fields are separated by whitespace (spaces and tabs), but we can change this using -F.
Example: Use Space (default separator)
awk -F' ' '{print $1, $4}' employee.txt
Output:
ajay 45000
sunil 25000
varun 50000
amit 47000
tarun 15000
deepak 23000
sunil 13000
satvik 80000
Here,
- -F' ' tells AWK to use a space as the separator.
- $1 = first column (Name)
- $4 = fourth column (Salary)
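The -F option becomes more useful once the data is not whitespace-separated. As a minimal sketch, assuming hypothetical comma-separated rows (this data is made up for illustration and is not part of employee.txt), a comma can be passed as the separator:

```shell
# Hypothetical comma-separated rows, piped in so no input file is needed.
# -F',' makes awk split each line on commas instead of whitespace.
names_and_salaries=$(printf '%s\n' \
  'ajay,manager,account,45000' \
  'sunil,clerk,account,25000' |
  awk -F',' '{print $1, $4}')

echo "$names_and_salaries"
```

The comma between $1 and $4 in the print statement is unrelated to -F; it only tells print to join the two values with the output field separator, which is a space by default.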
2. Using -f (Program File Option)
You can write your AWK code in a file and execute it with -f.
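Example: Create a script file named print_salary.awk:
{ print $1, "has salary", $4 }
Then run it with -f:
awk -f print_salary.awk employee.txt
Output:
ajay has salary 45000
sunil has salary 25000
varun has salary 50000
amit has salary 47000
tarun has salary 15000
deepak has salary 23000
sunil has salary 13000
satvik has salary 80000
- The program file is executed for every line of the input file.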
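The whole -f workflow can be tried end to end from the shell. The following is a rough sketch that assumes a throwaway directory (the workdir variable and the two-row sample are illustrative choices, not part of the original example):

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# Two sample rows in the same four-column layout as employee.txt.
printf '%s\n' \
  'ajay manager account 45000' \
  'sunil clerk account 25000' > "$workdir/employee.txt"

# Write the awk program to a file. Quoting 'EOF' stops the shell
# from expanding $1 and $4 before awk sees them.
cat > "$workdir/print_salary.awk" <<'EOF'
{ print $1, "has salary", $4 }
EOF

# Run the program file against the input file.
report=$(awk -f "$workdir/print_salary.awk" "$workdir/employee.txt")
echo "$report"

rm -r "$workdir"
```

Keeping the program in a file avoids shell quoting problems and makes longer awk programs easier to edit and reuse.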
3. Using -v (Variable Assignment Option)
The -v option lets you define variables before AWK begins processing.
Example 1: Define a Custom Message
awk -v msg="Employee Details:" 'BEGIN {print msg}'
Output:
Employee Details:
- -v msg="Employee Details:" defines the variable msg, and the BEGIN block prints it before any input is processed.
Example 2: Use Variable in Condition
awk -v limit=40000 '$4 > limit {print $1, $4}' employee.txt
Output:
ajay 45000
varun 50000
amit 47000
satvik 80000
- The variable limit is assigned the value 40000.
- The command prints employees whose salary is greater than 40000.
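A common use of -v is to hand a shell variable to awk without splicing it into the program text. Here is a small sketch, assuming a hypothetical shell variable min_salary and three sample rows in the employee.txt layout:

```shell
# Hypothetical shell variable holding the threshold.
min_salary=40000

# -v copies the shell value into the awk variable "limit" before
# any input is read, so the awk program itself needs no shell quoting tricks.
high_earners=$(printf '%s\n' \
  'ajay manager account 45000' \
  'sunil clerk account 25000' \
  'varun manager sales 50000' |
  awk -v limit="$min_salary" '$4 > limit {print $1, $4}')

echo "$high_earners"
```

Because the value looks like a number, awk compares it numerically against $4 rather than as a string.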
More Examples of the awk command in Linux
1. Search Lines with a Keyword
This command finds and prints every line in the employee.txt file that contains the word "manager".
$ awk '/manager/ {print}' employee.txt
Output:
ajay manager account 45000
varun manager sales 50000
amit manager account 47000
In the above example, the awk command prints all the lines that match the pattern 'manager'.
2. Print Specific Columns
For each record (that is, each line), awk splits the record on whitespace by default and stores the pieces in the $n variables. If the line has 4 words, they are stored in $1, $2, $3 and $4 respectively. Also, $0 represents the whole line.
$ awk '{print $1,$4}' employee.txt
Output:
ajay 45000
sunil 25000
varun 50000
amit 47000
tarun 15000
deepak 23000
sunil 13000
satvik 80000
In the above example, $1 and $4 represent the Name and Salary fields respectively.
3. Use of the NR built-in variable (Display Line Number)
This command prefixes each line of employee.txt with its line number and prints the result to standard output (your terminal).
$ awk '{print NR,$0}' employee.txt
Output:
1 ajay manager account 45000
2 sunil clerk account 25000
3 varun manager sales 50000
4 amit manager account 47000
5 tarun peon sales 15000
6 deepak clerk sales 23000
7 sunil peon sales 13000
8 satvik director purchase 80000
In the above example, the awk command with NR prints all the lines along with the line number.
4. Use of the NF built-in variable (Display Last Field)
$ awk '{print $1,$NF}' employee.txt
Output:
ajay 45000
sunil 25000
varun 50000
amit 47000
tarun 15000
deepak 23000
sunil 13000
satvik 80000
In the above example, $1 represents Name and $NF represents Salary. NF holds the number of fields in the line, so $NF always refers to the last field.
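Since NF is an ordinary number, it can also be used in arithmetic to address fields relative to the end of the line. A minimal sketch on a single sample row (the variable name field_info is only illustrative):

```shell
# NF is the field count, $(NF-1) is the second-to-last field,
# and $NF is the last field of the current line.
field_info=$(printf '%s\n' 'ajay manager account 45000' |
  awk '{print NF, $(NF-1), $NF}')

echo "$field_info"
```

The parentheses in $(NF-1) matter: $NF-1 would instead subtract 1 from the value of the last field.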
5. Another use of the NR built-in variable (Display Lines 3 to 6)
This command prints lines 3 through 6 (inclusive) of employee.txt, along with their line numbers.
$ awk 'NR==3, NR==6 {print NR,$0}' employee.txt
Output:
3 varun manager sales 50000
4 amit manager account 47000
5 tarun peon sales 15000
6 deepak clerk sales 23000
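The comma in 'NR==3, NR==6' is a range pattern: it starts matching at the first line where NR==3 is true and stops after the line where NR==6 is true. For a fixed line range, a plain comparison gives the same result. The sketch below uses eight made-up lines (line1 to line8) purely to compare the two forms:

```shell
# Eight placeholder lines: line1 ... line8.
sample=$(printf 'line%s\n' 1 2 3 4 5 6 7 8)

# Range pattern: from the line where NR==3 through the line where NR==6.
range_form=$(printf '%s\n' "$sample" | awk 'NR==3, NR==6 {print NR, $0}')

# Equivalent comparison on NR for the same fixed range.
compare_form=$(printf '%s\n' "$sample" | awk 'NR>=3 && NR<=6 {print NR, $0}')

echo "$range_form"
```

Both variables end up holding lines 3 to 6. The range form is more general, because its start and end can be any patterns, such as regular expressions.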
The remaining examples use the following text file:
Command:
cat geeksforgeeks.txt
Output:
A B C
Tarun A12 1
Man B6 2
Praveen M42 3
AWK Built-in Variables
| Variable | Meaning | Description |
|---|---|---|
| $0 | Entire line | Represents the whole input line |
| $1, $2, … | Field variables | Represent individual fields |
| NR | Record number | Current line number being processed |
| NF | Number of fields | Total number of fields in a line |
| FS | Field separator | Character separating fields (default: space) |
| RS | Record separator | Character separating records (default: newline) |
| OFS | Output field separator | Used between output fields (default: space) |
| ORS | Output record separator | Used between output records (default: newline) |
1) To print the first item along with the row number (NR), separated by " - ", from each line in geeksforgeeks.txt
This command reads the file line by line and prints each line's number followed by the first word of that line.
$ awk '{print NR " - " $1}' geeksforgeeks.txt
1 - A
2 - Tarun
3 - Man
4 - Praveen
2) To return the second column/item from geeksforgeeks.txt:
$ awk '{print $2}' geeksforgeeks.txt
B
A12
B6
M42
3) To print any non-empty line if present
This command prints every line that has at least one field, skipping empty and blank lines. Since geeksforgeeks.txt has no empty lines, all of its lines are printed.
$ awk 'NF > 0' geeksforgeeks.txt
A B C
Tarun A12 1
Man B6 2
Praveen M42 3
To print the line numbers of empty lines instead, test for NF == 0. For this file the command prints nothing, because no line is empty:
$ awk 'NF == 0 {print NR}' geeksforgeeks.txt
4) To find the length of the longest line present in the file
This command finds and prints the length of the longest line in geeksforgeeks.txt.
$ awk '{ if (length($0) > max) max = length($0) } END { print max }' geeksforgeeks.txt
13
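If the longest line itself is wanted along with its length, the same idea extends naturally. This is one possible sketch, using the four lines of geeksforgeeks.txt piped in directly; the variable names max, line and longest are illustrative:

```shell
# Remember both the greatest length seen so far and the line that had it.
# The comparison is used as a pattern, so the action runs only on a new maximum.
longest=$(printf '%s\n' 'A B C' 'Tarun A12 1' 'Man B6 2' 'Praveen M42 3' |
  awk 'length($0) > max { max = length($0); line = $0 }
       END { print max ": " line }')

echo "$longest"
```

Uninitialized awk variables such as max start out as 0 in numeric comparisons, which is why no explicit initialization is needed.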
5) To count the lines in a file
This command counts and prints the total number of lines in a file. In the END block, NR holds the number of the last record read.
$ awk 'END { print NR }' geeksforgeeks.txt
4
6) Printing lines with more than 10 characters
It is used to filter and print only those lines from geeksforgeeks.txt that have more than 10 characters.
$ awk 'length($0) > 10' geeksforgeeks.txt
Tarun A12 1
Praveen M42 3
7) To find/check for any string in any specific column
This command finds and prints every line from geeksforgeeks.txt whose second field (column) is exactly "B6".
$ awk '{ if ($2 == "B6") print $0; }' geeksforgeeks.txt
Man B6 2
8) To print the squares of the numbers from 1 to n, say 6
This command is used to generate and print the squares of numbers from 1 to 6.
$ awk 'BEGIN { for(i=1;i<=6;i++) print "square of", i, "is",i*i; }'
square of 1 is 1
square of 2 is 4
square of 3 is 9
square of 4 is 16
square of 5 is 25
square of 6 is 36
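To avoid hard-coding the upper bound, n can be supplied with the -v option described earlier. A short sketch, with n=3 chosen only to keep the output small:

```shell
# The program has only a BEGIN block, so awk runs it without reading any input.
# n is supplied from the command line with -v.
squares=$(awk -v n=3 'BEGIN { for (i = 1; i <= n; i++) print "square of", i, "is", i * i }')

echo "$squares"
```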
Built-In Variables In Awk
Awk's built-in variables include the field variables ($1, $2, $3, and so on, with $0 holding the entire line), which break a line of text into individual words or pieces called fields.
- NR: Keeps a running count of the number of input records read so far. Remember that records are usually lines. Awk performs the pattern/action statements once for each record in a file.
- NF: Holds the number of fields in the current input record.
- FS: Holds the field separator used to divide the input line into fields. The default is whitespace, meaning space and tab characters. FS can be reassigned to another character (typically in BEGIN) to change the field separator.
- RS: Stores the current record separator. Since, by default, an input line is the input record, the default record separator is a newline.
- OFS: Stores the output field separator, which separates the fields when awk prints them. The default is a single space. Whenever print has several arguments separated by commas, it prints the value of OFS between them.
- ORS: Stores the output record separator, which separates the output lines when awk prints them. The default is a newline character. print automatically outputs the contents of ORS at the end of whatever it is given to print.
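Several of these variables can be seen working together in one short command. The following sketch assumes two made-up comma-separated rows with different field counts; the " | " output separator is an arbitrary choice to make the effect of OFS visible:

```shell
# FS is set in BEGIN so it applies from the first line onward.
# OFS is what print inserts between its comma-separated arguments.
summary=$(printf '%s\n' 'ajay,manager,45000' 'sunil,clerk' |
  awk 'BEGIN { FS = ","; OFS = " | " } { print NR, NF, $1 }')

echo "$summary"
```

Each output line shows the record number, the number of fields found in that record, and the first field, joined by the custom OFS.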