sauravomar: March 2019

Merkle trees, also known as hash trees (usually binary), allow secure verification of the contents of large datasets. They are named after their inventor, Ralph Merkle, who filed a patent on the idea in 1979. They use common hash functions such as MD5, SHA-256, and SHA-3 to verify datasets.

A hash function produces a fixed-size value (a digest) for each dataset. Identical data always produces the same digest, and any change in the data produces, with overwhelming probability, a different digest. So you can verify whether a dataset has changed by comparing digests.
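As a minimal sketch of this idea, the snippet below hashes two nearly identical byte strings with SHA-256 from Python's standard `hashlib` module. The helper name `digest` and the sample strings are made up for illustration.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


original = b"hello merkle tree"
modified = b"hello merkle tree!"  # one extra byte

# Identical input always gives the same digest.
same = digest(original) == digest(original)

# A one-byte change gives a completely different digest.
changed = digest(original) != digest(modified)

print(same, changed)
```

Comparing two short digests is enough to tell whether the underlying data differs, however large the data is.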

How does Merkle Tree work?

Each node in a Merkle tree has at most two children.
A Merkle tree is built from the bottom up: each parent node stores the hash of its children's hashes, and this repeats level by level until only one node remains, called the Merkle Root or Root Node.
First, we hash each file or dataset individually. Then we combine the hashes two at a time and hash each pair, and repeat this on the results until we reach the root node.
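The bottom-up construction described above can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the names `h` and `merkle_root` are made up, SHA-256 is an arbitrary choice, and duplicating the last hash on an odd-sized level is only one common convention.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Compute a Merkle root by hashing pairs level by level."""
    if not blocks:
        raise ValueError("need at least one block")
    # Leaf level: hash every block individually.
    level = [h(block) for block in blocks]
    while len(level) > 1:
        # Assumption: on an odd-sized level, pair the last hash with itself.
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Parent level: hash the concatenation of each pair of children.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"file one", b"file two", b"file three", b"file four"])
print(root.hex())
```

Changing any single block changes its leaf hash, then its parent's hash, and so on up to the root.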

Let us look at an example:

Suppose we have 4 files: 1.txt, 2.txt, 3.txt, and 4.txt.

Let’s calculate the hash of each file using the md5 command in bash.

md5 1.txt

MD5 (1.txt) = 9d59f1f9f080fc9e51be0699d82e8af9

md5 2.txt

MD5 (2.txt) = d41d8cd98f00b204e9800998ecf8427e

md5 3.txt

MD5 (3.txt) = d41d8cd98f00b204e9800998ecf8427e

md5 4.txt

MD5 (4.txt) = d41d8cd98f00b204e9800998ecf8427e

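Note that d41d8cd98f00b204e9800998ecf8427e is the MD5 of empty input, so 2.txt, 3.txt, and 4.txt were empty in the run above. The rest of this example uses the following first-level (leaf) hashes, from which we calculate the second level: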

MD5 (1.txt) = b026324c6904b2a9cb4b88d6d61c81d1

MD5 (2.txt) = 26ab0db90d72e28ad0ba1e22ee510510

MD5 (3.txt) = 6d7fce9fee471194aa8b5b6e47267f03

MD5 (4.txt) = 48a24b70a0b376535542b996af517398

For the next level, we hash the concatenation of each pair of leaf hashes:

md5 <<< "b026324c6904b2a9cb4b88d6d61c81d126ab0db90d72e28ad0ba1e22ee510510"

Hash(Hash(1.txt) + Hash(2.txt)) = f5efc447b0f95a0b5510699ea58812fc

md5 <<< "6d7fce9fee471194aa8b5b6e47267f0348a24b70a0b376535542b996af517398"

Hash(Hash(3.txt) + Hash(4.txt)) = e9fea713e28bc38dd48121b7788db106

Similarly for the root:

md5 <<< "f5efc447b0f95a0b5510699ea58812fce9fea713e28bc38dd48121b7788db106"

Merkle Root = Hash(Hash(Hash(1.txt) + Hash(2.txt)) + Hash(Hash(3.txt) + Hash(4.txt))) = e9fb3547e97c4ca95127a07eaac153c5
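The same walkthrough can be sketched in Python. Two assumptions are baked in and labeled in the comments: each file is assumed to contain its own number followed by a newline (which matches the leaf hash b026324c... listed for 1.txt), and the helper `md5_here_string` appends a newline because a bash here-string (`<<<`) adds one to its input. The helper names are made up for this sketch.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """MD5 digest of raw bytes as a hex string, like `md5 file`."""
    return hashlib.md5(data).hexdigest()


def md5_here_string(text: str) -> str:
    """Mimic `md5 <<< "text"`: a bash here-string appends a newline."""
    return md5_hex(text.encode("ascii") + b"\n")


# Assumption: each file holds its own number followed by a newline.
files = {
    "1.txt": b"1\n",
    "2.txt": b"2\n",
    "3.txt": b"3\n",
    "4.txt": b"4\n",
}

# Level 1: hash each file's contents.
leaves = [md5_hex(content) for content in files.values()]

# Level 2: hash the concatenated hex digests of each pair.
h12 = md5_here_string(leaves[0] + leaves[1])
h34 = md5_here_string(leaves[2] + leaves[3])

# Root: hash the concatenation of the two level-2 digests.
root = md5_here_string(h12 + h34)

print("leaves:", leaves)
print("level 2:", h12, h34)
print("root:", root)
```

Note that this scheme hashes the hexadecimal text of the child digests rather than their raw bytes. Either works as long as the same rule is applied everywhere.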

Now we have seen how Merkle trees work.

How to detect changes in files?

If the content of any file changes, the hash of that file changes, and so does every hash on the path from that file up to the root. To find the change, we start at the root and compare hashes with the previously stored values. Let’s say 1.txt is updated. The root hash no longer matches, so we move to its children and compare them with their previous values: the node Hash(1.txt) + Hash(2.txt) has changed, while the node Hash(3.txt) + Hash(4.txt) has not. We then compare the hashes of 1.txt and 2.txt and find the updated file.

Here we can see that we don’t have to compare every file. Because the tree is binary, we only need to compare about log N hashes (one path from the root down to a leaf) to locate a changed file among N files.
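The top-down search can be sketched as follows. This is an illustrative sketch under stated assumptions: the number of blocks is a power of two, both trees have the same shape, and only one block changed (with several changes it returns the leftmost). The names `build_levels` and `find_changed_leaf` are made up.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, leaves first and the root level last.

    Assumption: len(blocks) is a power of two.
    """
    level = [h(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def find_changed_leaf(old: list[list[bytes]], new: list[list[bytes]]):
    """Walk from the root to the changed leaf; None if the roots match."""
    top = len(old) - 1
    if old[top][0] == new[top][0]:
        return None
    index = 0
    # At each level, look at the left child; if it is unchanged,
    # the difference must be under the right child.
    for depth in range(top - 1, -1, -1):
        left = 2 * index
        index = left if old[depth][left] != new[depth][left] else left + 1
    return index


before = build_levels([b"1\n", b"2\n", b"3\n", b"4\n"])
after = build_levels([b"1\n", b"2\n", b"3 edited\n", b"4\n"])
print(find_changed_leaf(before, after))
```

The loop does one comparison per level, so the work grows with the height of the tree rather than with the number of files.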

Major benefits of Merkle Trees:

They provide a means to prove the integrity and validity of data.
They require little memory or disk space as the proofs are computationally easy and fast.
Their proofs and management only require tiny amounts of information to be transmitted across networks.
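To make the point about small proofs concrete, here is a sketch of a Merkle inclusion proof. To show that one block belongs to a tree with a known root, you only need the sibling hash at each level, which is about log N hashes. The function names and the proof format (a list of `(sibling_hash, sibling_is_left)` pairs) are assumptions made for this sketch, and the block count is assumed to be a power of two.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks: list[bytes]) -> list[list[bytes]]:
    """All tree levels, leaves first. Assumes a power-of-two block count."""
    level = [h(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect the sibling hash at each level, from the leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # the other child of the same parent
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(block: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path to the root using only the block and the proof."""
    node = h(block)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


blocks = [f"block {i}".encode() for i in range(8)]
levels = build_levels(blocks)
root = levels[-1][0]
proof = make_proof(levels, 5)

print(len(proof), verify(blocks[5], proof, root))
```

For 8 blocks the proof holds 3 hashes; for about a million blocks it would hold around 20.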

Merkle trees are used in the Apache Cassandra database to detect differences in data between nodes in a cluster, so that the nodes can quickly sync only the data that differs.

They are also used in blockchain technologies such as Bitcoin and Ethereum.

That's it.

I will share some more concepts soon. Until then, goodbye.

Happy Learning.

PS: Feedback is always welcome.

sauravomar

Monday, 11 March 2019
