
Explain how Git itself is a Merkle Tree/DAG.

Sure, I'd be happy to explain how Git itself can be considered a Merkle Tree/DAG.

In Git, every stored object is identified by a hash of its contents. A file's contents are stored as a blob, a folder is stored as a tree that lists the names and hashes of its entries, and a commit records the hash of the top-level tree together with metadata such as the author, the commit message, and the hashes of its parent commits. When you change a file, its blob gets a new hash, which in turn changes the hash of every folder that contains it and of the commit that records the change.

These hash values are what make Git a Merkle Tree/DAG. Within a single commit, the leaf nodes are file contents (blobs) and each non-leaf node is a folder (tree) whose hash is computed from the names and hashes of its child nodes. This means that the hash value of a folder depends on the hash values of all the files/folders contained within it. Because each commit also includes the hashes of its parent commits, and a merge commit has more than one parent, the history as a whole forms a directed acyclic graph rather than a simple tree.

This Merkle Tree/DAG structure in Git allows for efficient version tracking, as users can easily see which files/folders have been modified and view changes between versions. It also provides a way to ensure the integrity of the repository, as any changes made to the files/folders will result in a different hash value and therefore a different version in the Merkle Tree/DAG.
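As a rough illustration of the point above, the sketch below hashes objects the way Git's default SHA-1 object format does (a `<type> <size>\0` header followed by the body). The `tree_id` helper is a simplification made up for this example: it handles only regular files with mode 100644 and no subfolders, which a real Git tree would also need to support.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object Git-style: SHA-1 over '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    """Object ID of a file's contents (a blob)."""
    return git_object_id("blob", content)


def tree_id(files: dict[str, bytes]) -> str:
    """Simplified folder hash: regular files only, no subfolders (assumption)."""
    body = b""
    for name in sorted(files):
        raw_child_id = bytes.fromhex(blob_id(files[name]))
        body += b"100644 " + name.encode("utf-8") + b"\0" + raw_child_id
    return git_object_id("tree", body)


before = tree_id({"README": b"hello\n", "main.py": b"print('hi')\n"})
after = tree_id({"README": b"hello\n", "main.py": b"print('bye')\n"})

print(blob_id(b"hello\n"))   # same ID that `git hash-object` reports for this content
print(before != after)       # editing one file changes the folder's hash
```

Only `main.py` changed between the two folders, yet the folder hash differs, which is how a change to one file propagates up to the commit that records it.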

Overall, the use of Merkle Tree/DAG structure in Git allows for efficient and secure version tracking, making it a popular choice for software development teams and other collaborative projects.


Discussion

Colby Serpa 2y ago

Explain the paper below simply so other bitcoiners can understand, but explain by writing tweets rather than large paragraphs. They understand hashing etc. but don’t know a lot about the CAP theorem.

Paper:

Abstract. Nostr is never going to be able to replace a layer 1 blockchain as a synchronized database or ledger. However, we can use the layer 1 bitcoin blockchain to keep Nostr 2.0 relays in sync.

Nostr 2.0 may be able to provide secure off-chain data storage as a Layer 2 atop bitcoin similar to how Lightning provides instant off-chain payments as a Layer 2 atop bitcoin.

This paper will elucidate how Nostr relays can synchronize their data while maintaining the lightweight nature of Nostr that lets users optionally delete data, something a layer 1 blockchain should not provide. It may also be cheaper for users to store tons of data with Nostr relays instead of in the bitcoin blockchain because of the limited capacity and speed of bitcoin blocks.

The simple computer science design below improves the distributed properties of the Nostr network under the standardized criteria known as the CAP theorem. CAP stands for Consistency, Availability, and Partition tolerance.

Nostr Relays Don’t Know When a Profile is Incomplete. Relays lack Consistency (C in the CAP Theorem).

Consistency means that every computer holding a copy of the database holds identical data. Nostr relays cannot synchronize their data in a trust-minimized way, like a blockchain does block by block. Unlike bitcoin full-nodes, the database Nostr relays store is often incomplete. Nostr relays have no means of discovering what data is missing, besides blindly requesting all posts signed with a specific user’s key.

Nostr’s Consistency Problem (C in the CAP theorem)

Nostr’s Consistency/Synchronization Problem: if two users upload their individual posts to different Nostr relays, then the two users might not be able to see each other’s posts because Nostr isn’t like a blockchain. In a blockchain, all the full-nodes keep the blockchain in sync every time there is a new entry. All the full-nodes add that data, in the form of a block, to their blockchain in unison. Every full-node on the bitcoin blockchain has the exact same blockchain.

Consistency Only Occurs If Users are Connected to Mutual Relays

If we want Nostr users to always be able to see each other’s posts, then all the Nostr relays need a way to identify what data is missing from user profiles so that they can request the missing pieces from other Nostr relays or users.

Syncing Nostr Relays with Weekly On-Chain Merkle Roots & Whole Tree Hashes.

1. Once a week or so, a user can arrange ALL their posts into a Merkle tree.

2. Each leaf in the Merkle tree contains a hash of a post, just like in bitcoin where each leaf contains a hash of a transaction.

3. Once a user arranges their entire profile into a Merkle tree, they will post the Merkle root on-chain in an OP_RETURN output of a normal bitcoin transaction. OP_RETURN is a script opcode that marks a transaction output as unspendable and lets a small note be attached to the transaction before it is signed by the sender. Because OP_RETURN already exists in bitcoin, Nostr 2.0 does not need to hardfork the blockchain to work.

4. Additionally, the user will take a hash of the entire tree and upload it on-chain with the Merkle root (in the same OP_RETURN). The Merkle root is computed directly from only the top branches, not from the entire tree at once. The whole tree hash is vital to giving users and relays the ability to detect when profile data is missing.

Merkle Root Hash (Hashing Hash12 and Hash34 TOGETHER)

5. To obtain the whole tree hash, put the Merkle root at the top of a text file, the Merkle branches on the lines underneath the root, and the Merkle leaves on the lines underneath the branches. Once the tree is arranged this way, hash all of it at once. The example below is a Whole Tree Hash of the Merkle tree seen above.

Whole Tree Hash (hashing all merkle tree data AT ONCE)
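Steps 1 through 5 could be sketched roughly as follows. The paper does not fix the details, so several choices here are assumptions for illustration only: single SHA-256, hashes handled as hex text, an odd-sized level pairing its last hash with itself (as bitcoin does), and the "text file" being the hashes joined by newlines with the root first.

```python
import hashlib


def h(data: bytes) -> str:
    """SHA-256 as a hex string (assumed hash function)."""
    return hashlib.sha256(data).hexdigest()


def merkle_levels(posts: list[str]) -> list[list[str]]:
    """Return every level of the tree: leaves first, root level last."""
    if not posts:
        raise ValueError("need at least one post")
    level = [h(p.encode("utf-8")) for p in posts]   # step 2: leaf = hash of a post
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:                     # odd count: pair last hash with itself
            level = level + [level[-1]]
        level = [h((level[i] + level[i + 1]).encode("ascii"))
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def whole_tree_hash(levels: list[list[str]]) -> str:
    """Step 5: root on top, then branches, then leaves, all hashed at once."""
    lines = [node for level in reversed(levels) for node in level]
    return h("\n".join(lines).encode("ascii"))


posts = ["post 1", "post 2", "post 3", "post 4"]
levels = merkle_levels(posts)
root = levels[-1][0]                 # hash of Hash12 and Hash34 together
tree_hash = whole_tree_hash(levels)

print("Merkle root:    ", root)
print("Whole tree hash:", tree_hash)
```

Both values are 32-byte hashes, so together they fit in a single small on-chain note.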

The Merkle Root and Whole Tree Hash allow for 2 key functionalities:

• Merkle roots grant users and relays the ability to download one piece of a profile at a time, like being able to download a transaction without downloading the entire block.

• Whole tree hashes let users and relays know when a profile they are storing is incomplete. The whole tree hash only matches if you have every bit of data in the Merkle tree, unlike Merkle roots.

This inexpensive method can be used to update a user’s entire profile once a week, or however often they like. Nostr still works without this, as it does now, but a user can pay a few sats infrequently to synchronize their data across Nostr relays if they want all users to see their posts.
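To give a sense of how small the on-chain commitment is, here is a minimal sketch of an OP_RETURN output script carrying both hashes. The payload layout (Merkle root followed by whole tree hash, 64 bytes total) is an assumption for this example, not something the paper specifies, and the two hashes are placeholders. A 64-byte payload stays under the 80-byte limit that Bitcoin Core's default relay policy has historically applied to OP_RETURN data.

```python
import hashlib

OP_RETURN = 0x6A
MAX_DIRECT_PUSH = 75  # opcodes 0x01..0x4b push that many bytes directly


def op_return_script(payload: bytes) -> bytes:
    """Build an output script of the form: OP_RETURN <push payload>."""
    if not 1 <= len(payload) <= MAX_DIRECT_PUSH:
        raise ValueError("payload must be 1..75 bytes for a direct push")
    return bytes([OP_RETURN, len(payload)]) + payload


# Placeholder 32-byte values standing in for a real profile's hashes.
merkle_root = hashlib.sha256(b"placeholder merkle root").digest()
whole_tree = hashlib.sha256(b"placeholder whole tree hash").digest()

script = op_return_script(merkle_root + whole_tree)
print(len(script), script.hex()[:4])   # 66 6a40
```

The script is 66 bytes: one byte for OP_RETURN, one push-length byte, and the 64-byte payload. Building and signing the surrounding transaction is left to a wallet or library.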

Users and relays can download posts one branch at a time. After each branch, they hash it together with its sibling branches, working up toward the Merkle root, to check that the result matches the on-chain Merkle root (like SPV). If the branches hashed together match this Merkle root, then they know the branch is part of the user profile even if they don’t have the entire user profile yet. Users can download different branches of the same profile from many different Nostr relays while still verifying that each branch is valid and, with the whole tree hash, that the profile they downloaded is complete.

Downloading one branch at a time prevents delay attacks that could cripple many distributed networks, which is why Merkle roots and branches are used in the bitcoin whitepaper to secure SPV light wallets.
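The branch check described here is the same idea as an SPV Merkle proof. The sketch below is one possible way to do it, assuming single SHA-256 over raw bytes and bitcoin's convention of repeating the last node on odd-sized levels; the function names and sample posts are made up for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """All tree levels, leaves first; odd-sized levels repeat their last node."""
    if not leaves:
        raise ValueError("need at least one leaf")
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur.append(cur[-1])
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from the leaf up to the root: (hash, sibling_is_left)."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path to the root using only the leaf and its siblings."""
    node = leaf
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


posts = [b"gm", b"stacking sats", b"nostr note", b"pura vida", b"fifth post"]
leaves = [h(p) for p in posts]
levels = build_levels(leaves)
root = levels[-1][0]                      # this is what would be on-chain

proof = make_proof(levels, 2)             # supplied by a relay alongside the post
print(verify(h(b"nostr note"), proof, root))   # True
print(verify(h(b"forged note"), proof, root))  # False
```

A verifier needs only the post, its proof, and the on-chain root, so each piece can come from a different relay.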

Why Can’t Merkle Roots Do What a Whole Tree Hash Does?: If a Nostr relay relies only on Merkle roots, then it will not know when the Merkle tree is complete, because every valid pair of branches nearest the Merkle root hashes into the same Merkle root whether or not the relay holds the rest of the tree.

To be sure the user’s profile is complete, relays or users hash their entire updated Merkle tree to verify that it matches the whole tree hash on-chain. If the whole tree hashes match, then the user data is complete. If they do not match, then the relay or user can tell other relays what their latest leaf number is and request the missing branches until the whole tree hash does match.

To keep track of all the new Merkle roots added every week or so, Nostr relays must become bitcoin full-nodes. Nostr 2.0 relays are indirectly paid to store the bitcoin blockchain, strengthening the security of Bitcoin and Nostr simultaneously.
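The completeness check could look roughly like the sketch below. It is only an illustration under stated assumptions: single SHA-256 over hex text, a newline-joined serialization with the root first, a relay that holds the first few posts in leaf order, and a peer modeled as a plain list rather than a real relay connection.

```python
import hashlib


def h(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def whole_tree_hash(posts: list[str]) -> str:
    """Hash root, branches and leaves together as one newline-joined text."""
    level = [h(p) for p in posts]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                  # odd count: pair last hash with itself
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return h("\n".join(node for lvl in reversed(levels) for node in lvl))


def sync(local: list[str], peer: list[str], committed: str) -> list[str]:
    """Request posts after our latest leaf number until the commitment matches."""
    posts = list(local)
    while whole_tree_hash(posts) != committed:
        if len(posts) >= len(peer):
            raise RuntimeError("peer cannot complete the profile")
        posts.append(peer[len(posts)])      # ask for the next missing leaf
    return posts


full_profile = ["post 1", "post 2", "post 3", "post 4", "post 5"]
committed = whole_tree_hash(full_profile)   # the value that would be on-chain

relay_copy = full_profile[:3]               # this relay is missing two posts
print(whole_tree_hash(relay_copy) == committed)               # False
print(sync(relay_copy, full_profile, committed) == full_profile)  # True
```

The relay never has to trust the peer: it stops requesting only when its own recomputed hash equals the on-chain value.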

Limits of Nostr Storage: Rule of Thumb for User.

There’s a chance Nostr relays may lose some user data since relays have the freedom to choose what they want to store, unlike bitcoin full-nodes. Therefore, users should only store data on Nostr relays if they can back it all up locally. Web5’s self-hosting service may allow users to sync their backups across all their local devices, which would reduce the risk for users wary of using Nostr. At the end of the day, the blockchain is the only place where data is truly immutable. Still, Nostr is a decently secure hybrid that will work well for many applications. The trade-offs are listed below:

Three Layers of Trust-minimization:

• Immutable & expensive data storage on layer 1 that’s very difficult to censor. (on-chain blocks syncing all bitcoin full-nodes in unison)

• Mutable & inexpensive data storage on layer 2 that’s moderately difficult to censor. (off-chain Merkle trees & on-chain hashes syncing Nostr relays on a need-to-know basis)

• Local data storage synced across all your local devices that’s easy to censor. (centralized locally)

📜The fundamental trade-off between a Nakamoto Consensus Blockchain and Nostr:

The more Nostr relays there are that store a specific address’s data, the harder it will be to censor that data. This means popular data hosted by many Nostr relays may be harder to censor than unpopular data that is rarely downloaded.

On the other hand, Nakamoto Consensus blockchains prevent the censoring of data based on its age. The more time data has been in the blockchain, the harder it is to remove with a 51% attack.