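awk is a scripting language designed for text processing, and it is particularly well suited to files laid out as columns of data. It is included by default in most Unix-like operating systems, including Linux. awk reads its input one line at a time and splits each line into fields on whitespace, so $1 is the first column, $2 is the second, and $0 is the whole line.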
Here are some of the things you can do with awk:
Print Columns: The most basic use of awk is to print columns of data. For example, if you have a file called data.txt with the following content:
John 25 Engineer
Jane 28 Doctor
You can print the first column (names) with the following command:
awk '{print $1}' data.txt
Output:
John
Jane
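Several columns can be printed at once, in any order. The following is a minimal sketch that recreates the sample data in a temporary file (the shell variable names data and result are only for illustration) and prints the job followed by the name. The comma in print separates the two values with a space:

```shell
# Recreate the sample data.txt in a temporary file (illustrative setup).
data=$(mktemp)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$data"

# Print column 3 (job) followed by column 1 (name).
result=$(awk '{print $3, $1}' "$data")
printf '%s\n' "$result"
# Engineer John
# Doctor Jane

rm -f "$data"
```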
Filter Rows: You can use awk to filter rows based on a condition. When a condition is given without an action, awk prints every line for which it is true. For example, to print only the rows where the second column (age) is greater than 26:
awk '$2 > 26' data.txt
Output:
Jane 28 Doctor
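A condition can also compare a column against a string, and it can be followed by an action that prints only some of the columns. As a small sketch under the same sample data (recreated here in a temporary file; data and result are illustrative names), this keeps rows whose third column is exactly "Doctor" and prints the name and age:

```shell
# Recreate the sample data.txt in a temporary file (illustrative setup).
data=$(mktemp)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$data"

# Keep rows whose third column equals "Doctor"; print name and age only.
result=$(awk '$3 == "Doctor" {print $1, $2}' "$data")
printf '%s\n' "$result"
# Jane 28

rm -f "$data"
```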
Perform Calculations: awk can perform calculations on the data. For example, to add 5 to the age of each person in the output (the file itself is not modified):
awk '{$2 = $2 + 5; print}' data.txt
Output:
John 30 Engineer
Jane 33 Doctor
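A calculation does not have to overwrite a column; the result can be printed directly. The sketch below, which assumes the same sample data recreated in a temporary file (data and result are illustrative names), prints each name with the age converted to months:

```shell
# Recreate the sample data.txt in a temporary file (illustrative setup).
data=$(mktemp)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$data"

# Print the name and the age multiplied by 12, leaving $2 itself unchanged.
result=$(awk '{print $1, $2 * 12}' "$data")
printf '%s\n' "$result"
# John 300
# Jane 336

rm -f "$data"
```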
Text Substitution: You can use awk to substitute text with the gsub function, which replaces every match of a pattern in the current line. For example, to replace “Engineer” with “Software Engineer”:
awk '{gsub("Engineer","Software Engineer"); print}' data.txt
Output:
John 25 Software Engineer
Jane 28 Doctor
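gsub accepts an optional third argument naming the field to change, which keeps the substitution from touching other columns. In this sketch (sample data recreated in a temporary file; the replacement word "Physician" and the names data and result are chosen only for illustration), the replacement is limited to the third column:

```shell
# Recreate the sample data.txt in a temporary file (illustrative setup).
data=$(mktemp)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$data"

# Replace "Doctor" only within column 3, then print the whole line.
result=$(awk '{gsub(/Doctor/, "Physician", $3); print}' "$data")
printf '%s\n' "$result"
# John 25 Engineer
# Jane 28 Physician

rm -f "$data"
```

The related sub function takes the same arguments but replaces only the first match.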
Pattern Matching: awk can also match lines against a regular expression written between slashes. For example, to print lines that contain “Doctor”:
awk '/Doctor/ {print}' data.txt
Output:
Jane 28 Doctor
Combining Patterns and Actions: A pattern can be paired with an action, and ! negates the pattern. For example, to print the names of people whose line does not contain “Doctor”:
awk '!/Doctor/ {print $1}' data.txt
Output:
John
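A single awk script can also hold several pattern-action rules; every rule is tried against every line, in order. The following sketch (sample data recreated in a temporary file; the messages and the names data and result are illustrative) uses two rules in one script:

```shell
# Recreate the sample data.txt in a temporary file (illustrative setup).
data=$(mktemp)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$data"

# Two rules in one script: one for lines matching "Doctor", one for the rest.
result=$(awk '
  /Doctor/  {print $1, "is a doctor"}
  !/Doctor/ {print $1, "is not a doctor"}
' "$data")
printf '%s\n' "$result"
# John is not a doctor
# Jane is a doctor

rm -f "$data"
```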
Built-in Variables: awk has several built-in variables. For example, NF holds the number of fields (columns) in the current line, so $NF refers to the last field. To print the last column of each row:
awk '{print $NF}' data.txt
Output:
Engineer
Doctor
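Another commonly used built-in variable is NR, the number of the line currently being read. As a brief sketch on the same sample data (recreated in a temporary file; data and result are illustrative names), this prints the line number, the field count, and the last field:

```shell
# Recreate the sample data.txt in a temporary file (illustrative setup).
data=$(mktemp)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$data"

# NR = current line number, NF = number of fields, $NF = last field.
result=$(awk '{print NR, NF, $NF}' "$data")
printf '%s\n' "$result"
# 1 3 Engineer
# 2 3 Doctor

rm -f "$data"
```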
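User-Defined Variables: You can define your own variables in awk. They need no declaration and start out as zero (or an empty string). The END block runs once, after all input has been read. For example, to calculate the average age: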
awk '{total += $2; count++} END {print total/count}' data.txt
Output:
26.5
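With empty input, count stays at zero and the average would require a division by zero. One possible guard, sketched here with the sample data and an empty file (both temporary; the "no data" message and the shell variable names are assumptions for illustration), checks count in the END block:

```shell
# Recreate the sample data.txt and create an empty file (illustrative setup).
data=$(mktemp)
empty=$(mktemp)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$data"

# Same average as before, but only divide when at least one line was read.
average='{total += $2; count++} END {if (count > 0) print total/count; else print "no data"}'

result=$(awk "$average" "$data")
result_empty=$(awk "$average" "$empty")
printf '%s\n' "$result" "$result_empty"
# 26.5
# no data

rm -f "$data" "$empty"
```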
Functions: awk supports several built-in functions. For example, length returns the length of a string. To print the length of each name:
awk '{print length($1)}' data.txt
Output:
4
4
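Two other built-in string functions are toupper, which converts a string to upper case, and substr, which extracts part of a string. A short sketch on the sample data (recreated in a temporary file; data and result are illustrative names):

```shell
# Recreate the sample data.txt in a temporary file (illustrative setup).
data=$(mktemp)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$data"

# Upper-case the name and take the first three characters of the job.
result=$(awk '{print toupper($1), substr($3, 1, 3)}' "$data")
printf '%s\n' "$result"
# JOHN Eng
# JANE Doc

rm -f "$data"
```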
Passing Variables: You can pass variables to awk using the -v option. For example, to print rows where the age is greater than a certain value:
awk -v age=26 '$2 > age' data.txt
Output:
Jane 28 Doctor
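The -v option is also the usual way to hand a shell variable to awk, since the awk script itself is normally single-quoted and the shell does not expand variables inside it. In this sketch, min_age is a hypothetical shell variable and the sample data is recreated in a temporary file:

```shell
# Recreate the sample data.txt in a temporary file (illustrative setup).
data=$(mktemp)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$data"

# min_age is a hypothetical shell variable; -v copies its value into awk's "age".
min_age=26
result=$(awk -v age="$min_age" '$2 > age' "$data")
printf '%s\n' "$result"
# Jane 28 Doctor

rm -f "$data"
```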
File Processing: awk can process multiple files. For example, if you have another file data2.txt:
Alice 30 Lawyer
Bob 35 Engineer
You can print the names from both files:
awk '{print $1}' data.txt data2.txt
Output:
John
Jane
Alice
Bob
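When several files are read, the built-in variables FILENAME (the file currently being read) and FNR (the line number within that file) show where each line came from. The sketch below recreates both sample files in a temporary directory (the dir and result variable names are illustrative) and prefixes each name with its source:

```shell
# Recreate data.txt and data2.txt in a temporary directory (illustrative setup).
dir=$(mktemp -d)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$dir/data.txt"
printf 'Alice 30 Lawyer\nBob 35 Engineer\n' > "$dir/data2.txt"

# FILENAME = current file, FNR = line number within that file.
result=$(cd "$dir" && awk '{print FILENAME ": " FNR ": " $1}' data.txt data2.txt)
printf '%s\n' "$result"
# data.txt: 1: John
# data.txt: 2: Jane
# data2.txt: 1: Alice
# data2.txt: 2: Bob

rm -rf "$dir"
```

Unlike FNR, NR keeps counting across files, so it would reach 4 on the last line here.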
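Complex Conditions: Conditions can be combined with && (and), || (or), and ! (not). For example, to print rows where the name starts with ‘J’ (the pattern /^J/ matches at the start of the line, which is where the name is) and the age is less than 30: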
awk '/^J/ && $2 < 30' data.txt
Output:
John 25 Engineer
Jane 28 Doctor
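To test one specific column against a regular expression rather than the whole line, the ~ operator can be used. As a sketch on the same sample data (recreated in a temporary file; data and result are illustrative names, and the age limit of 28 is chosen so that only one row qualifies):

```shell
# Recreate the sample data.txt in a temporary file (illustrative setup).
data=$(mktemp)
printf 'John 25 Engineer\nJane 28 Doctor\n' > "$data"

# Column 1 must start with "J" and column 2 must be below 28.
result=$(awk '$1 ~ /^J/ && $2 < 28' "$data")
printf '%s\n' "$result"
# John 25 Engineer

rm -f "$data"
```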
These examples cover the most common uses of awk: selecting columns, filtering rows, and transforming or summarizing text from one or more files.