sauravomar: February 2019

Saturday, 23 February 2019

AWK Predefined Variables

We have already covered the basic text filtering part of awk. If you have not read that article yet, please go through it first, as it is the prerequisite for this one. Now we look at the predefined variables that are present in awk.

Predefined Variables in AWK:

AWK maintains a set of predefined (built-in) variables. Some of them control how input and output are split and joined, and others are filled in by awk as it reads the input.

RS : input record separator

FS : input field separator

ORS : output record separator

OFS : output field separator

NF : number of fields in current input line

NR : number of the current input line

FILENAME : current input file name

BEGIN and END in AWK:

We already know that awk is also a programming language. An awk program is a list of pattern { action } rules, and there are two special patterns, BEGIN and END. Both are optional.

The BEGIN block runs once before awk reads any line from the input, and the END block runs once after all the lines have been read.

Let's see an example:

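awk '
   BEGIN { printf "no of people whose marks are 85 or more is " }
   NR > 1 && $4 >= 85 { counter += 1 }   # NR > 1 skips the header row
   END { printf "%s\n", counter }
' example.txt

OUTPUT:
no of people whose marks are 85 or more is 4

The opening brace must be on the same line as BEGIN. Without the NR > 1 test the header row is counted as well, because the text "Marks" compares greater than "85" as a string, and the count becomes 5.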

awk executes the BEGIN block before it reads the first line of input and, similarly, the END block just before it exits.
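The three phases (BEGIN, per-line rules, END) can be sketched with a small average calculation. This is only an illustration: the temporary file from mktemp and the variable names total, rows, and summary are assumptions, not part of the original example.

```shell
# Recreate the sample marks file in a temporary location (illustrative only).
data=$(mktemp)
cat > "$data" <<'EOF'
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
EOF

# BEGIN runs once before any input, the middle rule runs for every line,
# and END runs once after the last line.
summary=$(awk '
  BEGIN  { total = 0 }                  # initialise before reading input
  NR > 1 { total += $4; rows += 1 }     # skip the header, add up the marks
  END    { printf "rows=%d average=%.1f\n", rows, total / rows }
' "$data")

echo "$summary"
rm -f "$data"
```

The END block is the natural place for totals and averages, because it only runs once every line has been seen.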

FILENAME keyword:

Print the file name using awk.

awk '{print FILENAME;}' example.txt

OUTPUT:

example.txt
example.txt
example.txt
example.txt
example.txt
example.txt

or

awk ' BEGIN {} END { print FILENAME} ' example.txt

OUTPUT:

example.txt

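FILENAME is printed to the console. In the first command the number of times it is printed depends on the number of lines in the file, because by default awk reads the file line by line and runs the action for every line. In the second command it is printed only once, from the END block.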
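When several files are passed to awk, a common way to print each name exactly once is to test FNR, the line number within the current file. The sketch below assumes two throwaway files, first.txt and second.txt, created in a temporary directory purely for illustration.

```shell
# Two small sample files in a scratch directory (names are illustrative).
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first.txt"
printf 'c\n'    > "$dir/second.txt"

# FNR restarts at 1 for every input file, so this rule fires once per file.
names=$(cd "$dir" && awk 'FNR == 1 { print FILENAME }' first.txt second.txt)

echo "$names"
rm -rf "$dir"
```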

RS: Input Record Separator. While parsing text, the default record separator is the newline (\n). We can change it depending on the requirement.

Let's see an example.

echo "SEQ Name Subject Marks;1) Saurav Physics 80;2) Deepak Maths 90;3) Dhoni Biology 87;4) Kedar English 85;5) Pandya History 89;" | awk 'BEGIN { RS=";" } {print $0}'

SEQ Name Subject Marks

1) Saurav Physics 80

2) Deepak Maths 90

3) Dhoni Biology 87

4) Kedar English 85

5) Pandya History 89

As we can see, the text passed to echo is separated by semicolons (";"), so awk reads it record by record, splitting on ";" instead of on newlines.
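A shorter sketch of the same idea, numbering each record with NR so the effect of RS is easy to see. The three-name input string is made up for illustration.

```shell
# Split one line of text into records on ";" and number each record.
records=$(printf 'Saurav 80;Deepak 90;Dhoni 87' |
  awk 'BEGIN { RS = ";" } { print NR ": " $1 }')

echo "$records"
```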

FS: Input Field Separator. By default it is a single space, which awk treats as any run of spaces and tabs. Changing it is very helpful when the columns are separated by a different delimiter, such as "," (comma, in the case of CSV) or ":". We can also use the -F flag to achieve this.

Let's see some examples:

cat example_new.txt

SEQ:Name:Subject:Marks
1):Saurav:Physics:80
2):Deepak:Maths:90
3):Dhoni:Biology:87
4):Kedar:English:85
5):Pandya:History:89

awk 'BEGIN {FS=":"} {print $1,$2,$3,$4 }' example_new.txt

SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89

or

awk -F':' '{print $1,$2,$3,$4 }' example_new.txt
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89

As we can see from the example, the input is delimited by colons (":"), and awk splits each line into fields on that delimiter.
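The two ways of setting the separator behave the same. A small sketch on a passwd-style record (the sample line is hard-coded here rather than read from a real /etc/passwd):

```shell
line='root:x:0:0:root:/root:/bin/bash'   # sample passwd-style record

# Same result whether FS is set with -F or inside a BEGIN block.
with_flag=$(printf '%s\n' "$line" | awk -F: '{ print $1, $7 }')
with_begin=$(printf '%s\n' "$line" | awk 'BEGIN { FS = ":" } { print $1, $7 }')

echo "$with_flag"
echo "$with_begin"
```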

OFS: Output Field Separator. The default output field separator is a single space, and we can change it to any delimiter. It is inserted between the comma-separated arguments of print. Let's see some examples:

awk 'BEGIN {OFS=":"} {print $1,$2,$3,$4 }' example.txt

OUTPUT:

SEQ:Name:Subject:Marks
1):Saurav:Physics:80
2):Deepak:Maths:90
3):Dhoni:Biology:87
4):Kedar:English:85
5):Pandya:History:89

If we want to convert it to CSV:

awk 'BEGIN {OFS=","} {print $1,$2,$3,$4 }' example.txt

SEQ,Name,Subject,Marks
1),Saurav,Physics,80
2),Deepak,Maths,90
3),Dhoni,Biology,87
4),Kedar,English,85
5),Pandya,History,89

Similarly, we can update OFS as we want.
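One detail worth knowing: OFS only takes effect when awk builds the output from separate fields. A plain print of the whole line keeps the original spacing unless a field is assigned first, which forces awk to rebuild the line. A minimal sketch, with a hard-coded sample row:

```shell
row='1) Saurav Physics 80'

# A plain print outputs $0 untouched, so OFS is not applied.
unchanged=$(printf '%s\n' "$row" | awk 'BEGIN { OFS = "," } { print }')

# Assigning any field (even to itself) rebuilds $0 using OFS.
rebuilt=$(printf '%s\n' "$row" | awk 'BEGIN { OFS = "," } { $1 = $1; print }')

echo "$unchanged"
echo "$rebuilt"
```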

ORS: Output Record Separator. The default output record separator is a newline ("\n"), and we can change it to any delimiter. It is written after every print, including the last one. Let's see an example:

awk 'BEGIN {ORS=":"} {print $1,$2,$3,$4 }' example.txt

OUTPUT:

SEQ Name Subject Marks:1) Saurav Physics 80:2) Deepak Maths 90:3) Dhoni Biology 87:4) Kedar English 85:5) Pandya History 89:

NF: Number of fields in the current line. While reading input, awk keeps track of how many fields the current row contains, which depends on the field separator.
NR: Number of the current input line. While reading input, awk keeps track of how many lines (records) it has read so far, which depends on the record separator.

Let's see an example:

awk '{print "CURRENT LINE: " NR "\t TOTAL FIELDS in present line: " NF}' example.txt

Output:

CURRENT LINE: 1	 TOTAL FIELDS in present line: 4
CURRENT LINE: 2	 TOTAL FIELDS in present line: 4
CURRENT LINE: 3	 TOTAL FIELDS in present line: 4
CURRENT LINE: 4	 TOTAL FIELDS in present line: 4
CURRENT LINE: 5	 TOTAL FIELDS in present line: 4
CURRENT LINE: 6	 TOTAL FIELDS in present line: 4
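Two idioms that follow from these variables: $NF is the last field of the current line, and NR inside an END block is the total number of lines read. A small sketch on made-up input:

```shell
# Print line number, field count, and last field; then the total line count.
report=$(printf 'a b c\nd e\n' |
  awk '{ print NR, NF, $NF } END { print "records:", NR }')

echo "$report"
```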

That is the end of this tutorial. We will see coding constructs like if and else, loops, etc. in the next article.

Feedback is always welcome.

Happy Coding :)

Awk Tutorial for Beginners

What is AWK?

AWK is one of the most prominent text-processing and text-filtering utilities on GNU/Linux. It is also a very powerful programming language that can solve complex problems in very few lines of code.
Its name is derived from the family names of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan.
The GNU implementation, gawk, is maintained by the FSF (Free Software Foundation).
The basic syntax of awk is awk 'program' file.

Print file using awk?

It is similar to cat /etc/resolv.conf: it prints the file content to the console.

awk '//{print}' /etc/resolv.conf

or

awk '{print}' /etc/resolv.conf

Both commands print every line. The difference is that the first one has a pattern between the slashes: an empty pattern matches every line, and putting text there prints only the lines that match it. The second one has no pattern at all, so the action runs for every line. For example,

awk '/8\.8\.8\.8/{print}' /etc/resolv.conf

prints the lines that contain "8.8.8.8". The dots are escaped because an unescaped dot in a regex matches any character. The basic syntax of the first form is awk '/pattern/ {print}' file.

pattern: can be a regex or a plain string.

awk '/^saurav/{print}' /etc/passwd

In the above example, the lines that start with saurav are printed.

awk '/sql$/{print}' /etc/passwd

In the above example, the lines that end with sql are printed. In the same way, we can use any regex to print the matching lines.
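Because "." in a regex matches any single character, an unescaped pattern can match more than intended. The sketch below counts matches for both forms on two made-up lines; the sample text and variable names are assumptions for illustration.

```shell
sample=$(printf 'nameserver 8.8.8.8\nserial 8a8b8c8\n')

# Unescaped dots match any character, so both lines match.
loose=$(printf '%s\n' "$sample" | awk '/8.8.8.8/ { n += 1 } END { print n + 0 }')

# Escaped dots match only a literal ".", so only the first line matches.
strict=$(printf '%s\n' "$sample" | awk '/8\.8\.8\.8/ { n += 1 } END { print n + 0 }')

echo "loose=$loose strict=$strict"
```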

Print Column using awk?

By default, IFS (Internal Field Separator) in bash is whitespace (space, tab, and newline). Similarly, in awk the field separator FS defaults to whitespace, so fields are split on runs of spaces or tabs.

Here is the file, which contains 4 columns, that I am going to use to explain:

SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89

Printing the 3rd column:

awk '//{print $3}' example.txt

Output:

Subject
Physics
Maths
Biology
English
History

Let's see how to print columns 2 and 4:

awk '//{print $2 $4}' example.txt

Output:

NameMarks
Saurav80
Deepak90
Dhoni87
Kedar85
Pandya89

Here we can see that awk prints the columns without any separator, because values written next to each other are concatenated. If you want to separate the columns, use ',' (comma).

awk '//{print $2, $4;}' example.txt

Output:

Name Marks
Saurav 80
Deepak 90
Dhoni 87
Kedar 85
Pandya 89

Using printf in awk?

printf helps to format the output.

For example:

awk 'NR>1 {printf "Marks=%d Subject=%s\n",$4, $3 }' example.txt

Output:
Marks=80 Subject=Physics
Marks=90 Subject=Maths
Marks=87 Subject=Biology
Marks=85 Subject=English
Marks=89 Subject=History

As you can see in the above example, printf in awk works much like the printf function in C.
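As in C, a width in the format specifier pads the value, which is handy for lining up columns. A minimal sketch with two made-up rows: %-8s left-aligns a string in 8 characters and %3d right-aligns a number in 3.

```shell
# Align names in an 8-character column and marks in a 3-character column.
table=$(printf 'Saurav 80\nDhoni 87\n' |
  awk '{ printf "%-8s|%3d\n", $1, $2 }')

echo "$table"
```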

Comparison Operators in AWK:

In awk, you can compare columns and print the matching lines to the console.

For example:

awk '$4 > 85 {print;}' example.txt

SEQ Name Subject Marks
2) Deepak Maths 90
3) Dhoni Biology 87
5) Pandya History 89

The above example prints the lines whose 4th column (marks) is greater than 85. The header row is printed too, because "Marks" is not a number and is therefore compared with 85 as a string.

The available comparison operators are:

> : greater than
< : less than
>= : greater than or equal to
<= : less than or equal to
== : equal to
!= : not equal to
some_value ~ /pattern/ : true if some_value matches the pattern
some_value !~ /pattern/ : true if some_value does not match the pattern

If we want to print the marks of Deepak:

awk '$2 ~ "Deepak" { print $0 ; }' example.txt

Output:
2) Deepak Maths 90

Similarly, we can get the matching rows using the other comparison operators.
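The negated match operator !~ works the same way in reverse. The sketch below lists the names of everyone whose subject is neither Maths nor Physics; the helper function marks is an assumption that simply reproduces the sample file.

```shell
# Hypothetical helper that prints the sample data used in this article.
marks() {
  cat <<'EOF'
SEQ Name Subject Marks
1) Saurav Physics 80
2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89
EOF
}

# NR > 1 skips the header; !~ keeps rows whose subject does NOT match.
others=$(marks | awk 'NR > 1 && $3 !~ /^(Maths|Physics)$/ { print $2 }')

echo "$others"
```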

Compound operation in AWK:

In awk, we can combine multiple expressions to filter text, using the && (and) and || (or) operators.

Let's see some examples.

Print the rows of the people who have marks of 85 or more in History.

awk '($4 >= 85) && ($3 ~ "History") { print $0 ; }' example.txt

OUTPUT:

5) Pandya History 89

Print the rows of the people who have marks of 85 or more, or whose subject is History.

awk 'NR > 1 && (($4 >= 85) || ($3 ~ "History")) { print $0 ; }' example.txt

OUTPUT:

2) Deepak Maths 90
3) Dhoni Biology 87
4) Kedar English 85
5) Pandya History 89

The NR > 1 test skips the header row, which would otherwise be printed because "Marks" >= 85 is true as a string comparison. Similarly, we can combine multiple expressions to filter the text.

Next Keyword in AWK:

The next keyword is somewhat similar to continue in other programming languages like Java and Scala. It really helps when there are multiple rules to evaluate and, once one of them has matched, you want to skip all the remaining rules for that line.

For example:

awk 'FNR == 1 { next }
     $4 >= 85 { printf "%s\t%s\n", $0, "EXEMPTION" ; next }
     $4 < 85  { printf "%s\t%s\n", $0, "PASSED" }' example.txt

Output:

1) Saurav Physics 80 PASSED
2) Deepak Maths 90 EXEMPTION
3) Dhoni Biology 87 EXEMPTION
4) Kedar English 85 EXEMPTION
5) Pandya History 89 EXEMPTION

In the above example:

The first rule, FNR == 1 {next}, checks whether this is the first line of the file (the header) and, if so, skips to the next line.

The second rule, $4 >= 85 { printf "%s\t%s\n", $0, "EXEMPTION" ; next }, checks whether the 4th column (marks) is 85 or more; if so, it prints the line and goes to the next line.
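The effect of next is easiest to see when the rules overlap. In this sketch (grade bands and input numbers are made up), 95 satisfies both of the first two conditions, but next stops it from reaching the second rule.

```shell
grades=$(printf '95\n70\n40\n' | awk '
  $1 >= 90 { print $1, "A"; next }   # matched: skip the remaining rules
  $1 >= 60 { print $1, "B"; next }   # reached only if the rule above did not match
           { print $1, "C" }         # everything else
')

echo "$grades"
```

Without the next statements, 95 would be printed three times, once by each rule.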

Variables and Numeric Expressions:

Variables are placeholders that store a value in memory, as in other programming languages. awk variables do not need to be declared before use.

Syntax:

variable=value

Example:

marks=10
name="saurav"

Numeric expressions perform arithmetic, such as adding or dividing numbers, as in other programming languages.

Syntax: operand operator operand

Example:

var1=1
var2=2
var3= var1 + var2

Let's see an example:

Print a line number with every line in the console.

awk 'FNR == 1 { next }        # skip the header row
     {
       line = $0              # store the line awk has just read
       line_no += 1           # increment line_no with every line read
       printf "%d\t%s\n", line_no, line
     }' example.txt

OUTPUT:

1 1) Saurav Physics 80
2 2) Deepak Maths 90
3 3) Dhoni Biology 87
4 4) Kedar English 85
5 5) Pandya History 89
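Variables can also be set from the command line with the -v option, which keeps thresholds out of the program text. A minimal sketch: the variable name min and the shortened three-row input are assumptions for illustration.

```shell
# Count the rows whose marks are at least the threshold passed in with -v.
count=$(printf 'SEQ Name Subject Marks\n1) Saurav Physics 80\n2) Deepak Maths 90\n4) Kedar English 85\n' |
  awk -v min=85 'NR > 1 && $4 >= min { n += 1 } END { print n + 0 }')

echo "$count"
```

The n + 0 in the END block makes awk print 0 rather than an empty string when no row matches.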

Happy Coding :).

Tuesday, 12 February 2019

Basic tutorial of SED (Stream Editor) for beginners

What is SED?

Sed is a stream editor: a non-interactive text editor for modifying files automatically. It is commonly used on Linux/Unix-based systems. Sed takes its input as a stream and updates that stream according to the instructions it is given.

Many system developers and admins use this command on a daily basis to update, replace, or filter text in strings or files.

How to use?

I will use the following file to explain the commands:

for seq in `seq 1 5`; do echo "CAT_$seq" >> exp.txt; done

The above command creates the file "exp.txt", which contains CAT_1 to CAT_5, one per line.

Delimiter IN SED:

Many people believe that only '/' (slash) can be used as the delimiter. This is a myth: you can also use characters like "|", ",", "_", ":", etc.

Example:

echo "CAT"| sed 's:CAT:DOG:'
echo "CAT"| sed 's|CAT|DOG|'
echo "CAT"| sed 's_CAT_DOG_'
echo "CAT"| sed 's;CAT;DOG;'
echo "CAT"| sed 's,CAT,DOG,'

All of the above commands yield the same result.
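An alternative delimiter is most useful when the text itself contains slashes, such as file paths. The sketch below does the same substitution twice, once with escaped slashes and once with "|" as the delimiter; the path is made up for illustration.

```shell
# With "/" as the delimiter every slash in the path must be escaped.
slashes=$(echo "/usr/local/bin" | sed 's/\/usr\/local/\/opt/')

# With "|" as the delimiter the path can be written as-is.
pipes=$(echo "/usr/local/bin" | sed 's|/usr/local|/opt|')

echo "$slashes"
echo "$pipes"
```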

How to print line no using sed:

Using "=" we can print each line along with its line number:

Example:

sed '=' exp.txt

OUTPUT:

1
CAT_1
2
CAT_2
3
CAT_3
4
CAT_4
5
CAT_5

Print file using SED:

Example: Print lines 1 to 3.

sed '1,3p' exp.txt

OUTPUT:

CAT_1
CAT_1
CAT_2
CAT_2
CAT_3
CAT_3
CAT_4
CAT_5

By default, each line of input is printed to standard output after all of the commands have been applied to it, which is why lines 1 to 3 appear twice above. To suppress this behavior, use the -n flag.

sed -n '1,3p' exp.txt

OUTPUT:

CAT_1
CAT_2
CAT_3
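Besides fixed line numbers, an address can be "$" (the last line) or a /pattern/. A short sketch of these forms with -n and p, generating the five sample lines inline with printf instead of reading exp.txt:

```shell
# Hypothetical helper that prints CAT_1 .. CAT_5, one per line.
cats() { printf 'CAT_%s\n' 1 2 3 4 5; }

middle=$(cats | sed -n '2,4p')        # a range of line numbers
last=$(cats | sed -n '$p')            # "$" addresses the last line
matched=$(cats | sed -n '/CAT_3/p')   # lines matching a pattern

echo "$middle"
echo "$last"
echo "$matched"
```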

Print Non-consecutive lines:

How to print non-consecutive lines, for example lines 1 to 3 and line 5.

Example:

sed -n -e '1,3p' -e '5p' exp.txt

Output:

CAT_1
CAT_2
CAT_3
CAT_5

Here we have used the -e flag, which appends the editing command given as its argument to the list of commands.

The output is the same as running multiple sed commands one after another, as below.

sed -n '1,3p' exp.txt ; sed -n 5p exp.txt

Delete Lines and Print:

How to delete some of the lines and print all the remaining lines.

Example:

sed '3d' exp.txt

The above command deletes the 3rd line and prints all the other lines.

Output:

CAT_1
CAT_2
CAT_4
CAT_5
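The d command accepts the same addresses as p, so a range or a pattern can be deleted as well. A small sketch, again generating the sample lines inline rather than reading exp.txt:

```shell
# Hypothetical helper that prints CAT_1 .. CAT_5, one per line.
cats() { printf 'CAT_%s\n' 1 2 3 4 5; }

range=$(cats | sed '2,4d')        # delete lines 2 to 4
pattern=$(cats | sed '/CAT_3/d')  # delete lines that match CAT_3

echo "$range"
echo "$pattern"
```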

Inserting spaces in files:

Using "G" we can insert an empty line after every line in the file.

Example:

sed 'G' exp.txt

Output:

CAT_1

CAT_2

CAT_3

CAT_4

CAT_5

You can also run sed 'G;G' exp.txt to insert 2 blank lines. In general, the number of blank lines inserted equals the number of G commands separated by semicolons.

In Place Editing in Sed:

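Using the "-i" flag we can edit the file in place: the changes are written back to the same file and nothing is printed to the console.

Example:

sed -i 's/CAT/DOG/' exp.txt

Note that -in is not the same as -i -n: GNU sed reads the n as a backup suffix and saves a copy of the original as exp.txtn. On BSD/macOS sed, -i requires a suffix argument, which may be empty (-i '').

Contents of exp.txt afterwards: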

DOG_1
DOG_2
DOG_3
DOG_4
DOG_5
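Since in-place editing overwrites the file, it is often safer to keep a backup by attaching a suffix to -i. The sketch below works on a throwaway copy in a temporary directory; the .bak suffix and the paths are assumptions chosen for illustration.

```shell
# Work on a scratch copy so the real exp.txt is not touched.
work=$(mktemp -d)
printf 'CAT_%s\n' 1 2 3 > "$work/exp.txt"

# -i.bak edits exp.txt in place and keeps the original as exp.txt.bak.
sed -i.bak 's/CAT/DOG/' "$work/exp.txt"

edited=$(cat "$work/exp.txt")
backup=$(cat "$work/exp.txt.bak")

echo "$edited"
echo "$backup"
rm -rf "$work"
```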

Occurrences of pattern in SED:

When no occurrence is given, only the first match on each line is replaced; with the "g" flag, all occurrences are replaced. If you want to replace one particular occurrence, give its number as the flag, as below.

Example:

echo "CAT CAT CAT CAT CAT"| sed 's/CAT/DOG/2'

OUTPUT: CAT DOG CAT CAT CAT

As you can see in the above example, only the second occurrence is replaced.

If you want to replace everything from the second occurrence onwards, combine the number with g (this combination is a GNU sed extension):

echo "CAT CAT CAT CAT CAT"| sed 's/CAT/DOG/2g'

OUTPUT: CAT DOG DOG DOG DOG

Command S for substitution:

It replaces an occurrence of the pattern with the given replacement.

Replace String using String:

Example: Let's replace CAT_2 with DOG_2

sed 's/CAT_2/DOG_2/' exp.txt

or

cat exp.txt | sed 's/CAT_2/DOG_2/'

OUTPUT:

CAT_1
DOG_2
CAT_3
CAT_4
CAT_5

It replaces CAT_2 with DOG_2.

Note: Like most Linux utilities, sed reads the file line by line. It replaces the first occurrence of the pattern on a line and then moves to the next line. If you want to replace all the occurrences, use "g", which means global.

Example:

sed 's/CAT_2/DOG_2/g' exp.txt

or

cat exp.txt | sed 's/CAT_2/DOG_2/g'

We can also use a number instead of "g", which tells sed to replace only the occurrence at that position.

Example:

echo "my name is name and name" | sed 's/name/saurav/2'

Output: my name is saurav and name

The second occurrence of name is replaced with saurav.

Replace String using REGEX:

Example: Replace CAT with DOG at the start of each line (^ anchors the match to the start of the line)

sed 's/^CAT/DOG/' exp.txt

OUTPUT:

DOG_1
DOG_2
DOG_3
DOG_4
DOG_5

Sometimes we use the -E flag for regex matching in sed, for example:

sed -E 's/^CAT/DOG/' exp.txt

This is the extended regular expression flag. With it, a few characters such as '?', '+', '()', and '{}' are special without being escaped, whereas in basic mode (without -E) they must be escaped to get their special meaning. Extended regular expressions are therefore more convenient to write than basic ones.

Example:

echo "123 abc" | sed 's/[0-9]+//'

Output: 123 abc

echo "123 abc" | sed -E 's/[0-9]+//'

Output:  abc

In the above example, "+" is a special character (one or more) when "-E" is used, whereas without "-E" sed treats it as a literal "+", so nothing matches. Note that the space after the digits is kept in the second output.
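The same difference applies to groups. With -E, parentheses capture text that can be reused in the replacement as \1, \2, and so on; in basic mode the parentheses have to be escaped. A minimal sketch that swaps the two words of the sample string both ways:

```shell
# Extended syntax: ( ) and + are special without escaping.
swapped=$(echo "123 abc" | sed -E 's/([0-9]+) ([a-z]+)/\2 \1/')

# Basic syntax: groups are written \( \), and "one or more" as x followed by x*.
basic=$(echo "123 abc" | sed 's/\([0-9][0-9]*\) \([a-z][a-z]*\)/\2 \1/')

echo "$swapped"
echo "$basic"
```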

That's it. After going through this article you should have an idea of how sed works and of the different flags it offers. Some flags are not covered here, such as "r" (for reading from a file) and "w" (for writing to a file); the ones above are the basic, easy ones.

In case of any doubts or concerns, please comment below.

Happy Coding . :)
