As we've discussed, the decentralized web depends on linked data structures. Let's explore what those look like.
A Merkle tree (or simple "hash tree") is a data structure in which every node is hashed.
+--------+
| |
+---------+ root +---------+
| | | |
| +----+---+ |
| | |
+----v-----+ +-----v----+ +-----v----+
| | | | | |
| node A | | node B | | node C |
| | | | | |
+----------+ +-----+----+ +-----+----+
| |
+-----v----+ +-----v----+
| | | |
| node D | | node E +-------+
| | | | |
+----------+ +-----+----+ |
| |
+-----v----+ +----v-----+
| | | |
| node F | | node G |
| | | |
+----------+ +----------+
In a Merkle tree, nodes point to other nodes by their content addresses (hashes). (Remember, when we run data through a cryptographic hash, we get back a "hash" or "content address" that we can think of as a link, so a Merkle tree is a collection of linked nodes.)
As previously discussed, all content addresses are unique to the data they represent. In the graph above, node E
contains a reference to the hash for node F
and node G
. This means that the content address (hash) of node E
is unique to a node containing those addresses.
Getting lost? Let's imagine this as a set of directories, or file folders. If we run directory E through our hashing algorithm while it contains subdirectories F and G, the content-derived hash we get back will include references to those two directories. If we remove directory G, it's like Grace removing that whisker from her kitten photo. Directory E doesn't have the same contents anymore, so it gets a new hash.
As the tree above is built, the final content address (hash) of the root node is unique to a tree that contains every node all the way down this tree. If the data in any node were to change even by a single byte, the hash of the changed node would change, as would the hashes of all of its parent nodes.
In case you haven't noticed, this means that as a programmer you'll always need to build these data structures backwards, from the leaf nodes on up to the root node.
DAG is an acronym for Directed Acyclic Graph
. It's a fancy way of describing a
specific kind of Merkle tree (hash tree) where different branches in the tree can point at other branches
in the tree in a single forward direction, as illustrated by the image above.