The name awk comes from the initials of its designers: Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan. The original version of awk was written in 1977 at AT&T Bell Laboratories. In 1985 a new version made the programming language more powerful, introducing user-defined functions, multiple input streams, and computed regular expressions. This new version became generally available with Unix System V Release 3.1.
AWK is a programming language designed to search files for lines that match patterns and to perform actions on the matching lines.
AWK Commands
Download file: wget nishantmunjal.com/dataset/mywords
$ awk 'length($1) > 5 {print}' mywords
This prints every line whose first field is longer than five characters.
$ awk 'length($1) > 5' mywords
When the action is omitted, AWK prints the matching line by default, so this command is equivalent to the previous one.
$ awk '$1 ~ /^[bc]/ {print $1}' mywords
The above script prints all the words that begin with the character b or c. The regular expression is placed between two slash characters.
$ awk 'NR % 2 == 0 {print}' mywords
NR is a built-in variable that holds the number of the current record (line) being processed. The above program prints every second record of the mywords file: when NR modulo 2 equals 0, the line number is even.
$ awk '{print NR, $0}' mywords
Here NR prints the line number, and the $0 variable refers to the whole record.
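The commands above can be tried without downloading anything. The following is a minimal sketch that assumes a small made-up word list in place of the mywords file; the variable names are only illustrative.

```shell
# Hypothetical sample data standing in for the mywords file.
words=$(printf '%s\n' brown cloud sky forest book)

# Lines whose first field is longer than five characters.
long_words=$(printf '%s\n' "$words" | awk 'length($1) > 5')

# Every second record (NR is the number of the current record).
even_lines=$(printf '%s\n' "$words" | awk 'NR % 2 == 0')

# Each record prefixed with its record number.
numbered=$(printf '%s\n' "$words" | awk '{print NR, $0}')

printf '%s\n' "$long_words" "$even_lines" "$numbered"
```

With this sample list, only forest is longer than five characters, and the even-numbered records are cloud and forest.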
Download File: wget nishantmunjal.com/dataset/code.c
$ awk '{print substr($0, 4)}' code.c
This example uses the substr() function, which returns a substring of the given string. We apply the function to each line, skipping the first three characters. In other words, we print each record from the fourth character to its end.
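To see how the character positions are counted, here is a small sketch on a single made-up line rather than code.c. It also shows the optional third argument of substr(), which limits the length of the result.

```shell
# Hypothetical input line; any text works.
line='// comment text'

# substr(s, m) returns the part of s starting at character m (counting from 1).
tail_part=$(printf '%s\n' "$line" | awk '{print substr($0, 4)}')

# substr(s, m, n) limits the result to at most n characters.
head_part=$(printf '%s\n' "$line" | awk '{print substr($0, 4, 7)}')

printf '%s\n' "$tail_part" "$head_part"
```

Here the first three characters are the two slashes and a space, so the first call yields "comment text" and the second yields "comment".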
The Match Function
match() is a built-in string manipulation function. It tests whether the given string contains a match for a regular expression pattern. The first parameter is the string, the second is the regex pattern.
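As a quick illustration of what match() reports, the sketch below runs it on a single made-up word. It relies on the standard behavior that match() returns the position of the first match (or 0 when there is none) and sets the RSTART and RLENGTH variables.

```shell
# match(s, re) returns the 1-based position of the first match, or 0 if none.
pos=$(echo "rabbit" | awk '{print match($0, /b+/)}')

# After a successful match, RSTART holds the start and RLENGTH the length.
span=$(echo "rabbit" | awk 'match($0, /b+/) {print RSTART, RLENGTH}')

# No match: the function returns 0.
none=$(echo "rabbit" | awk '{print match($0, /z/)}')

echo "$pos" "$span" "$none"
```

Because a return value of 0 is false, match() can be used directly as a pattern, so that only lines containing a match are processed.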
Download File: wget nishantmunjal.com/dataset/mywords.c
$ awk 'match($0, /^[cb]/)' mywords
The program prints those lines that begin with c or b. The regular expression is placed between two slash characters.
$ awk 'match($0, /i/) {print $0 " has i character at " RSTART}' mywords
The match() function sets the RSTART variable; it is the index of the start of the matching pattern. This prints those words that contain the 'i' character, together with the position of its first occurrence.
$ awk -F: '{print $1, $7}' /etc/passwd | head -7
The -F option sets the field separator, here a colon. The command prints the first and seventh fields of /etc/passwd, and head limits the output to the first seven lines.
$ echo "Jane 17#Tom 23#Mark 34" | awk 'BEGIN {RS="#"} {print $1, "is", $2, "years old"}'
Jane is 17 years old
Tom is 23 years old
Mark is 34 years old
RS is the input record separator, by default a newline. In the example, the relevant data is separated by the # character, so RS is set to "#" to split the input into records at that character. AWK can also receive input from other commands, such as echo.
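The record separator and the field separator can be combined. The following is a minimal sketch on made-up data (the names and shells are invented for illustration): records are separated by a semicolon and fields by a colon.

```shell
# Made-up records separated by ';', with fields separated by ':'.
data='alice:/bin/bash;bob:/bin/sh;carol:/bin/zsh'

# RS splits the input into records; -F sets the field separator.
# printf '%s' is used so that no trailing newline ends up in the last field.
result=$(printf '%s' "$data" | awk -F: 'BEGIN {RS=";"} {print $1, "uses", $2}')

printf '%s\n' "$result"
```

Each record is printed on its own line, because print ends its output with the output record separator, which is a newline by default.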