We have already covered basic text filtering with awk. If you have not read that article, please go through it first, since it is a prerequisite for this one. In this article we look at the predefined variables that are available in awk.
Predefined Variables in AWK:
awk provides a set of predefined (built-in) variables. It fills some of them in while reading input and consults others to decide how to split input and format output:
RS
: input record separator
FS
: input field separator
ORS
: output record separator
OFS
: output field separator
NF
: number of fields in current input line
NR
: number of the current input line
FILENAME
: current input file name
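To check what the separators hold before any input is read, a minimal sketch is to print them from a BEGIN block. The square brackets are only there to make whitespace visible, and the shell variable name `defaults` is an illustrative choice.

```shell
# Print the default separators from a BEGIN block (no input file is needed).
defaults=$(awk 'BEGIN {
    printf "FS=[%s] OFS=[%s]\n", FS, OFS
    # RS and ORS default to a newline; compare instead of printing them raw.
    printf "RS is newline: %s\n", (RS == "\n") ? "yes" : "no"
    printf "ORS is newline: %s\n", (ORS == "\n") ? "yes" : "no"
}')
echo "$defaults"
```

The first line of output should read `FS=[ ] OFS=[ ]`, showing that both field separators default to a single space.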
BEGIN and END in AWK:
We already know that awk is a programming language in its own right. An awk program is a series of pattern { action } rules, and BEGIN and END are two special, optional patterns.
The BEGIN block runs once, before awk reads any line of input, and the END block runs once, after all lines of input have been read.
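Let's look at an example:
awk 'BEGIN { printf "number of people whose marks are 85 or higher: " } NR > 1 && $4 >= 85 { counter += 1 } END { printf "%d\n", counter }' example.txt
OUTPUT: number of people whose marks are 85 or higher: 4
The NR > 1 test skips the header line. Without it, the header's "Marks" field is compared with 85 as a string, the comparison succeeds, and the header is counted as well, giving 5.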
When awk starts, it executes the BEGIN block before reading the first line of input; similarly, it executes the END block just before it exits.
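As a further illustration of the BEGIN / per-line / END flow, here is a small sketch that computes an average. The sample data and the names `scores`, `summary`, `total` and `rows` are made up for this example and are not part of example.txt.

```shell
# Hypothetical sample data: a header line followed by three score rows.
scores='Name Marks
Saurav 80
Deepak 90
Dhoni 87'

# BEGIN runs before the first line, the middle rule once per data line,
# and END after the last line.
summary=$(printf '%s\n' "$scores" | awk '
    BEGIN  { print "Summary" }
    NR > 1 { total += $2; rows++ }
    END    { if (rows > 0) printf "rows=%d average=%.2f\n", rows, total / rows }
')
echo "$summary"
```

This prints `Summary` followed by `rows=3 average=85.67`. The guard in END avoids a division by zero when the input has no data rows.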
- FILENAME: holds the name of the file awk is currently reading.
To print the file name using awk:
awk '{ print FILENAME }' example.txt
OUTPUT:
example.txt
example.txt
example.txt
example.txt
example.txt
example.txt
or
awk ' BEGIN {} END { print FILENAME} ' example.txt
OUTPUT:
example.txt
In the first form the file name is printed once for every line in the file, because awk reads the file line by line and runs the action for each line. In the second form it is printed only once, from the END block.
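When several files are passed to awk, one way to print each name only once is to test FNR, another built-in variable that holds the line number within the current file. FNR is not covered in the list above, and the file names and the temporary directory below are hypothetical, created only for this sketch.

```shell
# Create two small throwaway files in a temporary directory.
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first.txt"
printf 'c\n' > "$tmpdir/second.txt"

# FNR restarts at 1 for every file, so this rule fires once per file.
file_report=$(cd "$tmpdir" && awk 'FNR == 1 { print "reading " FILENAME }' first.txt second.txt)
echo "$file_report"

rm -rf "$tmpdir"
```

This should print `reading first.txt` and then `reading second.txt`, one line per file rather than one per input line.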
- RS: the input record separator. By default it is the newline (\n), so each line is one record. We can set it to something else when the input calls for it.
Let's look at an example.
echo "SEQ Name Subject Marks;1) Saurav Physics 80;2) Deepak Maths 90;3) Dhoni Biology 87;4) Kedar English 85;5) Pandya History 89;" | awk 'BEGIN { RS=";" } { print $0 }'
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
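As we can see, the echoed text is delimited by semicolons (;). With RS set to ";", awk treats each semicolon-terminated chunk as one record instead of reading line by line. (The newline that echo appends after the last ";" forms one final record, which shows up as trailing blank lines.)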
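A related case worth knowing: setting RS to the empty string switches awk to "paragraph mode", where records are separated by blank lines. The sketch below combines that with FS set to a newline so that each line of a paragraph becomes one field. The input text and the variable name `paragraphs` are made up for illustration.

```shell
# Hypothetical input: two records separated by a blank line.
paragraphs=$(printf 'Saurav\nPhysics\n\nDeepak\nMaths\n' | awk '
    BEGIN { RS = ""; FS = "\n" }   # blank-line-separated records, one field per line
    { print NR ": " $1 " studies " $2 }
')
echo "$paragraphs"
```

This prints `1: Saurav studies Physics` and `2: Deepak studies Maths`.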
- FS: the input field separator. By default it is a single space, which awk treats specially: fields are split on runs of spaces and tabs. Setting FS is helpful when columns are separated by a different delimiter, such as a comma (as in CSV) or a colon. The -F command-line option achieves the same thing.
Let's look at some examples:
cat example_new.txt
SEQ:Name:Subject:Marks
1):Saurav:Physics:80
2):Deepak:Maths:90
3):Dhoni:Biology:87
4):Kedar:English:85
5):Pandya:History:89
awk 'BEGIN {FS=":"} {print $1,$2,$3,$4 }' example_new.txt
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
or
awk -F':' '{print $1,$2,$3,$4 }' example_new.txt
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
As the example shows, the input is delimited by colons (":"), and awk splits each line into fields on that delimiter.
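To contrast the default splitting with an explicit separator, here is a small sketch. The input lines are made up, and `default_count` and `csv_pick` are illustrative variable names.

```shell
# Default FS: runs of spaces and tabs count as one separator.
default_count=$(printf 'Saurav \t  Physics   80\n' | awk '{ print NF }')

# Comma-separated input: set FS with -F and pick the first and third fields.
csv_pick=$(printf 'Saurav,Physics,80\nDeepak,Maths,90\n' | awk -F',' '{ print $1, $3 }')

echo "$default_count"
echo "$csv_pick"
```

The first command reports 3 fields despite the uneven whitespace; the second prints `Saurav 80` and `Deepak 90`.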
- OFS: the output field separator. By default it is a single space; print places it between its comma-separated arguments. We can set it to any delimiter we need. Let's look at some examples:
awk 'BEGIN { OFS=":" } { print $1, $2, $3, $4 }' example.txt
OUTPUT:
SEQ:Name:Subject:Marks
1):Saurav:Physics:80
2):Deepak:Maths:90
3):Dhoni:Biology:87
4):Kedar:English:85
5):Pandya:History:89
If we want to convert it to CSV:
awk 'BEGIN { OFS="," } { print $1, $2, $3, $4 }' example.txt
SEQ,Name,Subject,Marks
1),Saurav,Physics,80
2),Deepak,Maths,90
3),Dhoni,Biology,87
4),Kedar,English,85
5),Pandya,History,89
Similarly, we can update OFS as we want.
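One detail that often surprises people: OFS is used only when awk joins fields, so printing $0 unchanged ignores it. A common idiom is to assign a field to itself, which makes awk rebuild the record with OFS. The sketch below shows both cases; the input line and the variable names `unchanged` and `rebuilt` are illustrative.

```shell
# print $0 emits the record untouched, so OFS has no visible effect.
unchanged=$(printf '1) Saurav Physics 80\n' | awk 'BEGIN { OFS = "," } { print $0 }')

# Assigning a field to itself makes awk rebuild $0, joining the fields with OFS.
rebuilt=$(printf '1) Saurav Physics 80\n' | awk 'BEGIN { OFS = "," } { $1 = $1; print $0 }')

echo "$unchanged"
echo "$rebuilt"
```

The first line comes out as `1) Saurav Physics 80` and the second as `1),Saurav,Physics,80`, which saves listing every field by hand.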
- ORS: the output record separator. By default it is a newline ("\n"), which print writes after every record. We can set it to any delimiter we need. Let's look at an example:
awk 'BEGIN { ORS=":" } { print $1, $2, $3, $4 }' example.txt
OUTPUT:
SEQ Name Subject Marks:1) Saurav Physics 80:2) Deepak Maths 90:3) Dhoni Biology 87:4) Kedar English 85:5) Pandya History 89:
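Because ORS is written after every record, including the last one, joining lines with it leaves a trailing separator. The sketch below shows that and one possible alternative that uses printf with a separator variable instead of ORS. The input names and the variables `names`, `with_ors`, `joined` and `sep` are assumptions made for this example.

```shell
# Hypothetical three-line input.
names='Saurav
Deepak
Dhoni'

# With ORS alone, the separator is also written after the last record.
with_ors=$(printf '%s\n' "$names" | awk 'BEGIN { ORS = ", " } { print }')

# Alternative: write the separator before every record except the first,
# then finish the line in END.
joined=$(printf '%s\n' "$names" | awk '{ printf "%s%s", sep, $0; sep = ", " } END { print "" }')

echo "[$with_ors]"
echo "[$joined]"
```

The first result is `[Saurav, Deepak, Dhoni, ]` and the second is `[Saurav, Deepak, Dhoni]`.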
- NF: the number of fields in the current line. As awk reads each line, it splits it on the field separator (FS) and stores the resulting field count in NF.
- NR: the number of the current input line. As awk reads input, it counts the records read so far; what counts as a record depends on the record separator (RS).
Let's look at an example:
awk '{ print "CURRENT LINE: " NR "\t TOTAL FIELDS in present line: " NF }' example.txt
Output:
CURRENT LINE: 1	 TOTAL FIELDS in present line: 4
CURRENT LINE: 2	 TOTAL FIELDS in present line: 4
CURRENT LINE: 3	 TOTAL FIELDS in present line: 4
CURRENT LINE: 4	 TOTAL FIELDS in present line: 4
CURRENT LINE: 5	 TOTAL FIELDS in present line: 4
CURRENT LINE: 6	 TOTAL FIELDS in present line: 4
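NF and NR are also handy inside patterns and field references. In the sketch below, NR > 1 skips a header line, $NF picks the last field whatever the column count, and NR in the END block gives the total number of lines read. The inline data and the variable names `table` and `report` are illustrative.

```shell
# Hypothetical input: a header line plus two data rows.
table='SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90'

report=$(printf '%s\n' "$table" | awk '
    NR > 1 { print "line " NR ": " $2 " scored " $NF }
    END    { print "lines read: " NR }
')
echo "$report"
```

This prints `line 2: Saurav scored 80`, `line 3: Deepak scored 90` and finally `lines read: 3`.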
That is the end of this tutorial. We will look at coding constructs such as if/else and loops in the next article.
Feedback is always welcome.
Happy Coding :)