Grokking Merkle Tree

John Ferrolino

3 Mar 2021

â€¢

• Scala

Grokking Merkle Trees in Scala

There are quite a few literatures that talk about this data structure by people who, the same as I, came from their early introduction to the blockchain technology and bitcoin.

This article is not trying to replicate the effort as some posts were in-depth in their walkthrough on the usage and importance of Merkle Tree in the blockchain space and other use cases. This is the reference I used that inspired me to write this article.

In this post, we will write an implementation of this data structure using the Scala language. The approach I will take is first to first a)define the data types then (b)implement the algorithm for Merkle tree. The simplest representative of this data structure is a binary tree where the input data are used to generate our leaf nodes that contain the cryptographic hash of a dataset which will continue iterating to create branches until it reaches the last node that we will call the root.

Abstract Data Types

We will define the Node as a trait where other types will refer from. It will contain a type parameter that represents the data type of the hash value which is typically be represented as an `Array[Byte]`.

``````trait Node[T] {
val hash: T
}
``````

The structure of the tree is composed of:

• a Leaf which is a node that does not have any children nodes and stores the hashed value of the input data.
``````case class Leaf[T](datum: T) extends Node[T]{
override val hash: T = ???
}
``````
• a Branch which is the node that is produced as a product of the hash value of its children nodes.
``````case class Branch[T](left: Node[T], right: Option[Node[T]]) 	extends Node[T] {
override val hash: T = ???
}
``````

As seen above, we are missing the computation for producing the hash value. We needed to provide the mechanism to generate this hash value through defining the hash function. We will then add another parameter in the case class definition written above with this:

`(implicit hashFn: (T, Option[T]) => T)`

This will be used both on the Leaf and Branch nodes whilst the second value parameter in the function literal is left optional.

To complete the definition, let us rewrite them and add the logic for the computation as follows:

``````case class Leaf[T](datum: T)(implicit hashFn: (T, Option[T]) => T extends Node[T] {
override val hash: T = hashFn(datum, None)
}

case class Branch[T](left: T, right: Option[Node[T]])(implicit hashFn: (T, Option[T]) => T) extends Node[T]{
override val hash: T = hashFn(left.hash, right.flatMap(r => Some(r.hash))
}
``````

We now have completed the abstract data types for this data structure. The other remaining thing to do is to work on writing the algorithm for our Merkle Tree.

Implementing the Merkle Tree

The merkle tree will be defined as a class with a companion object to hold the construction of the tree.

``````class MerkleTree[T](val root: Node[T])
``````

The gist of the algorithm will be written in its companion object. It will be composed of two parts:

``````- the constructor that accepts the input data of some type T and generates the leaf nodes that will store the hash value of the input data.
``````
``````object MerkleTree {
def apply[T](data: Seq[T])(implicit hashFn: (T,Option[T]) => T): MerkleTree[T] = ???
...
}
``````
``````- the recursive method that will keep on creating branches until a single  node is left. This node will represent the root of the Merkle tree.
``````
``````object MerkleTree {
...
def build[T](nodes: Seq[Node[T]])(implicit hashFn: (T, Option[T]) => T): MerkleTree[T] = ???
}
``````

We also need an outlet to plug in our function for generating the hash for our inputs which will then be used in the data types.

Define our constructor, `apply`

When creating an instance of the Merkle tree, the apply method will be called. It requires data as input which is typically an array of bytes.

The apply method will use those input data to generate our leaf nodes which will store the hash content. It will look something like below:

``````	def apply[T](data: Seq[T])(implicit hashFn: (T, Option[T]) => T): MerkleTree = {
val withLeaves = data.map(Leaf(_))
build(withLeaves)
}
``````

After generating the leaves, it will then pass this to `build` which will complete the construction of the Merkle tree.

Create our tree with the `build` method

The build method completes the creation of our Merkle Tree where it will recursively call itself until a single node is left, the root.

Whilst in the process, the sequence of nodes are paired from leaves and internal nodes (i.e. branch) to instantiate our `Branch` case class. The pair serves as an input to our branch to represent either the left or an optional right node.

``````	def build[T](nodes: Seq[Node[T]])(implicit hashFn: (T, Optional[T]) => T): MerkleTree = {
else {
val withBranches = nodes.grouped(2).map {
case Seq(l,r) => Branch(l, Some(r))
case Seq(a) => Branch(a, None)
}.toSeq
build(withBranches)
}
}
``````

How to use it?

The first thing to do is to define our cryptographic hash function implementation. For this, we will use the built-in MessageDigest utility available in the jdk with `SHA-256` encoding.

This function literal will be used and injected to the classes and methods that require it. One way of writing it will look like as follows:

``````	implicit val hashFn: (Array[Byte], Option[Array[Byte]])=> Array[Byte] = (byteArr, opt) => {
val cc = opt match {
case None => byteArr
case _ => byteArr ++ opt.get
}
java.security.MessageDigest.getInstance("SHA-256").digest(cc)
}
``````

To prepare, we can define a sequence of string as inputs. Since our hashFn expects an input of Array[Byte], we need to convert each items to its bytes equivalent.

It will be written as shown below.

``````val inputs = Seq("trans01", "trans02", "trans03").map(_.getBytes)

val tree = MerkleTree(inputs)
``````

For more details on how to use this, visit the test spec.

What's next?

So far, we only built a data structure in the most simplest way we could. This will help us understand the components being used as well as the algorithm to complete it. This is a good introduction to understanding the Merkle tree but far enough for practical use.

A follow up article(s) will ensue on topics for applying Merkle proofs and further improvements.

John Ferrolino

See other articles by John

Related jobs

See all

WorksHub

CareersCompaniesSitemapFunctional WorksBlockchain WorksJavaScript WorksAI WorksGolang WorksJava WorksPython WorksRemote Works

For companies

hello@works-hub.com

Ground Floor, Verse Building, 18 Brunswick Place, London, N1 6DZ

108 E 16th Street, New York, NY 10003