At what number of utxos does Bitcoin (let's say the Core implementation) fall over and crash?

At what number does it become impossible to run with a consumer grade laptop (let's say 16GB ram and plenty of disk for argument's sake)?

#asknostr #bitcoin

(For reference, we hit 180M+ utxos in the last couple of years due to runes and whatnot; utxos are not very expensive to make.)


Discussion

Doesn't dbcache just flush to disk when you hit your RAM limit?

Does the db not have limits? Or can it take any number of entries that fit on the disk?

The Bitcoin Core implementation only keeps a subset of utxos in memory, in a cache that defaults to 450 MB. The rest live on disk, which for this thought experiment you say we have plenty of.

Since blocks are limited to 4 MB, processing the utxos spent in any single block will never exceed that memory budget.

I don't think any number of utxos could cause it to fall over and crash.
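To put rough numbers on how little of the set the default cache covers (the per-entry size and total count here are ballpark assumptions, not measured values):

```python
# Back-of-envelope: how much of the utxo set fits in the default dbcache.
# Figures are rough assumptions: ~180M utxos, ~70 bytes per on-disk entry.
UTXO_COUNT = 180_000_000
BYTES_PER_ENTRY = 70            # assumed average on-disk footprint
DBCACHE_BYTES = 450 * 1024**2   # Bitcoin Core's default dbcache

total = UTXO_COUNT * BYTES_PER_ENTRY
print(f"approx. utxo set size: {total / 1024**3:.1f} GiB")
print(f"fraction cacheable at default dbcache: {DBCACHE_BYTES / total:.1%}")
# -> roughly 12 GiB of state, only a few percent of which is resident in the
#    default cache; everything else is a leveldb read when a block spends a cold utxo.
```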

Are there not db limits?

Also, with a very large db on disk, what are the performance implications for verifying new blocks? I understand the caching you describe optimises this greatly, but even with a fast disk, at what point does it get too slow to keep up with the tip?

(Interesting that if all spends are of recent utxos, we do kind of magically avoid the most obvious problems; thanks for pointing that out.)


I think there are ways to make validating a block very slow due to script validation, but that is orthogonal to utxo cardinality.

Blocks and utxos both live on disk (block data in flat files with a leveldb index, the utxo set in the leveldb chainstate), and blocks are an order of magnitude bigger with no performance issues. However, utxo values are much smaller, so the number of entries is far larger.
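To put rough numbers on that asymmetry (all figures below are ballpark assumptions for illustration, not measurements):

```python
# Rough comparison of block storage vs. the utxo database.
BLOCKCHAIN_BYTES = 600 * 1024**3   # ~600 GiB of block data on disk (assumed)
BLOCK_COUNT      = 850_000         # entries in the block index (assumed)
UTXO_BYTES       = 12 * 1024**3    # ~12 GiB chainstate (assumed)
UTXO_COUNT       = 180_000_000     # entries in the chainstate db

print(f"avg block record:  {BLOCKCHAIN_BYTES / BLOCK_COUNT / 1024:.0f} KiB")
print(f"avg utxo record:   {UTXO_BYTES / UTXO_COUNT:.0f} bytes")
print(f"entry count ratio: {UTXO_COUNT / BLOCK_COUNT:.0f}x more utxo entries")
# Blocks dwarf the utxo set in bytes, but the utxo db has ~200x more
# (much smaller) entries, and entry count is what stresses lookups.
```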

The next release of Bitcoin Core has a change to greatly speed up disk lookups of utxos as well.

Oh for sure, script validation is a more "real and present danger". I'm discussing this more theoretically: the "state" is the utxo set, and in principle you need all of it to do validation; blocks, you don't. I think cardinality is the relevant measure, though I'm not sure of the details, since the size of a utxo serialization is roughly constant.
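A quick sketch of why the per-entry footprint is roughly constant; this only approximates the shape of a per-output entry and is not Core's exact compressed encoding:

```python
# Approximate per-utxo footprint: an outpoint key plus a small value record.
# The real chainstate encoding compresses amounts and common script types,
# so actual entries are somewhat smaller; the shape is what matters here.
TXID   = 32   # outpoint txid
VOUT   = 4    # outpoint index (a varint in practice, often 1 byte)
HEIGHT = 4    # block height + coinbase flag
AMOUNT = 8    # satoshi value
SCRIPT = 25   # e.g. a P2PKH scriptPubKey; P2WPKH is 22 bytes, P2TR is 34

entry = TXID + VOUT + HEIGHT + AMOUNT + SCRIPT
print(f"~{entry} bytes per utxo, regardless of the amount it holds")
# Since the size barely varies with the coins stored, total state grows
# linearly with utxo *count*, which is why cardinality is the thing to watch.
```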

Lookups in a set aren't free, so a limit must exist somewhere, right?

Yes, I'm trying to find out what that limit would be in leveldb without much luck.

I see some GitHub issue commenters saying they operate leveldb DBs with multiple TBs and hundreds of billions of entries with no issues.

I haven't done the math yet, but I think it would take decades of constant utxo spam at 4 MB per 10 minutes to get there.
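The math roughly checks out; a hedged sketch, assuming every block is packed with nothing but tiny fresh outputs (output size and the neglected per-tx overhead are assumptions):

```python
# How long to spam the utxo set up to ~100 billion entries if blocks
# contain nothing but small outputs? Rough, best-case-for-the-spammer numbers.
BLOCK_WEIGHT   = 4_000_000          # consensus weight limit
OUTPUT_BYTES   = 31                 # e.g. a P2WPKH output (8 value + 1 len + 22 script)
WEIGHT_PER_OUT = OUTPUT_BYTES * 4   # non-witness bytes cost 4 WU each
BLOCKS_PER_DAY = 144

outputs_per_block = BLOCK_WEIGHT // WEIGHT_PER_OUT   # ~32k, ignoring tx overhead
per_day = outputs_per_block * BLOCKS_PER_DAY         # ~4.6M new utxos/day, best case
target  = 100_000_000_000                            # the "100s of billions" regime
print(f"{outputs_per_block} outputs/block, {per_day:,} per day")
print(f"years to reach {target:,}: {target / per_day / 365:.0f}")
# -> on the order of 60 years of uninterrupted, maximally efficient spam.
```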


Nice, good to know there is no trivial limit there, just from db operations. Presumably we would hit other limits. I guess this is a case where simulating on a testnet might be the way to find practical limits. Not a trivial project though!

Curious, what are your thoughts on libbitcoin nowadays? It's designed to address this and other architectural flaws of Core.

I thought it was interesting but I don't know enough to offer a strong opinion. But these questions would still apply, right? Just quantitative answers would be different, presumably.

(Iirc libbitcoin is faster but more memory hungry.)

It doesn't have the concept of a utxo set or a mempool in memory, just a database of transactions where confirmation status is only one field.

It's also multi-core, but I'm not sure how that relates to your question...

Oh really? That sounds a bit crazy, though it has its logic.

The multi-core part, yeah: they focused on parallelisation to speed up IBD. My vague memory from Eric's explanation was that it stuck everything in RAM to make access fast, but that's obviously not right.

My question is motivated by the, I hope, obvious intuition that there has to be a limit somewhere where things break, since you theoretically need access to the entire utxo set to complete validation of state updates.

The problem with this is that connecting blocks to the tip, once you reach steady state, will be much slower. Instead of looking up each input prevout in the set of utxos, you will have to look up each input prevout in the set of all txos ever created.
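A toy model of that difference (this is not libbitcoin's actual schema, just an illustration of how the key space you search grows without bound):

```python
# Contrast a pruned utxo map, where spent entries are deleted, with an
# append-only txo table where spent status is just a field on each record.
from dataclasses import dataclass

@dataclass
class Txo:
    value: int
    spent: bool = False

utxo_set = {}   # pruned: only unspent outpoints live here
all_txos = {}   # append-only: every outpoint ever created stays forever

def create(outpoint, value):
    utxo_set[outpoint] = value
    all_txos[outpoint] = Txo(value)

def spend(outpoint):
    del utxo_set[outpoint]            # pruned model forgets it
    all_txos[outpoint].spent = True   # append-only model keeps it

for i in range(1_000_000):
    create((f"txid{i}", 0), 1000)
    if i >= 100:                      # spend most outputs shortly after creation
        spend((f"txid{i - 100}", 0))

print(len(utxo_set), "entries to search in the pruned model")
print(len(all_txos), "entries to search in the append-only model")
# The live utxo count stays flat while the append-only table keeps growing,
# so prevout lookups search an ever larger set even in steady state.
```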

It's something Tadge and I want to test: just spin up a {signet, regtest} network and create utxos nonstop until it crashes. Should be a fun experiment.
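Something like this could drive the regtest version of that experiment; a minimal sketch, assuming bitcoin-cli is on PATH, a regtest node is running with a loaded wallet, and with arbitrary batch sizes and amounts:

```python
# Sketch: keep minting tiny outputs on regtest and watch the utxo count climb.
import json
import subprocess

def cli(*args):
    return subprocess.check_output(["bitcoin-cli", "-regtest", *args]).decode().strip()

miner = cli("getnewaddress")
cli("generatetoaddress", "101", miner)    # mature some coinbase outputs to spend

batch = 0
while True:
    # one transaction paying 100 fresh addresses -> ~100 new utxos
    outputs = {cli("getnewaddress"): "0.00001" for _ in range(100)}
    cli("sendmany", "", json.dumps(outputs))
    cli("generatetoaddress", "1", miner)  # confirm and keep the chain moving
    batch += 1
    if batch % 100 == 0:
        # "none" skips computing the utxo set hash (supported in recent Core versions)
        info = json.loads(cli("gettxoutsetinfo", "none"))
        print(f"{info['txouts']:,} utxos after {batch} batches")
```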