aj

Feel like I've had a similar experience on some throwaway sqlite db, but didn't commit the result to my long term memory...

Did you try batching the additions (begin..commit)?
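To make the batching suggestion concrete, here's a minimal sketch with Python's sqlite3 module (table name and row counts are made up for illustration). In autocommit mode every INSERT gets its own transaction, which forces a sync per statement on an on-disk database; wrapping the batch in a single BEGIN..COMMIT avoids that.

```python
import sqlite3

# Hypothetical benchmark table; an in-memory db won't show the speedup,
# but the pattern is the same for an on-disk one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, val TEXT)")

# Slow pattern: committing after every INSERT wraps each statement in
# its own transaction (one fsync per row on a real disk).
for i in range(100):
    conn.execute("INSERT INTO items (val) VALUES (?)", (f"row{i}",))
    conn.commit()

# Fast pattern: one explicit transaction around the whole batch.
with conn:  # issues BEGIN, then COMMIT on success (ROLLBACK on error)
    conn.executemany(
        "INSERT INTO items (val) VALUES (?)",
        ((f"row{i}",) for i in range(100, 200)),
    )

count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

On rotating disks the difference is commonly orders of magnitude.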

O(0) algorithms ftw. There has to be a tshirt idea there. Maybe a -420+0 diff on the back?

They want to shoulder the load: replacement is the goal. Why send your PRs to an open repo and have to deal with public review when you can push people to switch to a repo you control, with the ability to both block changes you disapprove of and force through changes you like? The technical and economic disagreements are a useful justification, but they're not the fundamental motivation.

Replying to Rusty Russell

I watched the video of nostr:npub1cev2qfuqv5s9jm5a6yvhc8ne8cdl9mh459m5g8zz759s7pw9fayqnketlq's Bitcoin-Lisp-Script talk (https://brink.dev/blog/2024/12/19/eng-call-aj-towns-bll/).

Summary:

1. Lisp is the classic alternative to Forth for embedded use, so it makes sense for Bitcoin Script.

2. Iteration is definitely a super power.

3. Definitely worthy of further research.

My main concern is that it requires much more broadly-defined limits. The varops work does a very limited subset of what is needed in general, *because* we operate within a 4MB max opcode limit already. varops *only* has to extend it so that large data doesn't make things worse, and we get to ignore anything which isn't proportional to data being operated on, figuring that's already possible.

An approach with iteration has to worry about more than that, requiring a general system of CPU consumption limits. That's not impossible, but I have *not* done that. For example, OP_DUP3 of three empty stack entries costs 0 varops. If you could do billions of these, the assumption that we can ignore those stack operations would be invalid.

The good news is that I am heading into that area with my OP_MULTI proposal. Prior to this, nothing operates on more than 3 stack entries, so we ignore the overhead of stack operations (OP_ROLL is the exception, but in practice it's really fast, and still limited to moving 998 entries). With OP_MULTI, this needs to be taken into account, and if it can cause significant time to be spent on stack operations, the varops model will have to be extended. However, OP_MULTI is still very much limited by the stack size. To be fair, I'm considering increasing that from 1000 to 32768, because it's reasonable to have almost that many outputs (P2WPKH), so I might be forced to extend the varops model to cost that appropriately.
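The varops idea described here (charge each opcode in proportion to the bytes it touches, under a budget that scales with transaction size) can be sketched roughly as follows. The class name, per-byte costs, and budget formula are illustrative only, not the draft BIP's actual figures; only the "5200 varops per tx byte" number echoes the discussion above.

```python
# Illustrative sketch of a varops-style budget: every opcode is charged
# in proportion to the data it operates on, and execution aborts once
# the per-script budget is exhausted. Names and costs are made up for
# illustration; they are not the draft BIP's actual definitions.

class BudgetExceeded(Exception):
    pass

class VaropsBudget:
    def __init__(self, tx_weight: int, varops_per_weight_unit: int = 5200):
        # Budget scales with transaction size, as described above.
        self.remaining = tx_weight * varops_per_weight_unit

    def charge(self, nbytes: int, cost_per_byte: int = 1) -> None:
        self.remaining -= nbytes * cost_per_byte
        if self.remaining < 0:
            raise BudgetExceeded("script exceeded its varops budget")

def op_cat(stack, budget):
    # Charged proportionally to the data actually concatenated, so a
    # fixed per-opcode cost can be ignored for data-proportional work.
    b, a = stack.pop(), stack.pop()
    budget.charge(len(a) + len(b))
    stack.append(a + b)

budget = VaropsBudget(tx_weight=250)
stack = [b"foo", b"bar"]
op_cat(stack, budget)
```

The point of the sketch: opcodes whose cost is proportional to operand size fit this model naturally, while iteration or bulk stack shuffling (the OP_MULTI concern) needs its own charging rule.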

Now I need to go read AJ's Python code, since I have other questions about the exact nature of these limits (does space include overhead? 0 byte allocations are not free!).

So, if you're interested in extending script, I believe you should fairly consider this. I would like it to advance to a proper BIP, of course, so we could get more of an apples-to-apples comparison, and that's a lot of work!

Side note: everyone working on Bitcoin Script replacements is smarter than me. It's intimidating! πŸ’œ

Broadly defined limits: the way this is designed is that each element has a strictly defined memory usage (16 bytes overhead for cons' two pointers, 16+size bytes for vector, plus 8 bytes overhead for type info and refcount in both cases), and a strictly defined lifetime. bllsh currently tracks the limits (ALLOCATOR.alloc() and .free() and alloc_size() in Element subclasses). Actual implementations might use less space (4 byte pointers vs 8?) or more (allocate space in powers of 2?) but should only be a factor of 2 or 4 or so off.

Enforcement of memory and execution cost limits is pretty preliminary: there's checks in symbll/WorkItem/step for no more than 100k steps (with no weightings applied) and no more than 400kB of memory used. This really should be using Allocator/record_work and over_limit to ensure intermediate state didn't go over the limit, and it all should apply to bll as well, but currently doesn't.
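The accounting described in the last two paragraphs could be approximated like this. This is a toy re-imagining, not bllsh's actual Allocator API; only the overhead figures (8 bytes of type/refcount overhead, 16 bytes for a cons, 16+size for a vector) and the 100k-step / 400kB caps come from the description above.

```python
# Toy allocator mirroring the accounting described above. Class and
# method names loosely echo bllsh's ALLOCATOR / record_work, but this
# is a sketch under stated assumptions, not bllsh's implementation.

STEP_LIMIT = 100_000
MEM_LIMIT = 400_000  # bytes

class LimitExceeded(Exception):
    pass

class Allocator:
    def __init__(self):
        self.mem = 0
        self.steps = 0

    def alloc_cons(self) -> int:
        # 8 bytes type/refcount overhead + 16 bytes for two pointers.
        return self._alloc(8 + 16)

    def alloc_vector(self, size: int) -> int:
        # 8 bytes overhead + 16-byte header + payload.
        return self._alloc(8 + 16 + size)

    def _alloc(self, nbytes: int) -> int:
        self.mem += nbytes
        if self.mem > MEM_LIMIT:
            raise LimitExceeded("memory limit")
        return nbytes

    def free(self, nbytes: int) -> None:
        self.mem -= nbytes

    def record_work(self, steps: int = 1) -> None:
        self.steps += steps
        if self.steps > STEP_LIMIT:
            raise LimitExceeded("step limit")

a = Allocator()
a.alloc_cons()
a.alloc_vector(100)
a.record_work()
```

Checking the limit inside `_alloc` is what catches intermediate state going over the cap, which is the gap noted above for the current symbll checks.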

Replying to Rusty Russell

yay! Thanks for the feedback!

> a) if the selector is meant to be scriptnum compatible, would be good to make that clear (ie 513 on the stack is interpreted as outputscript+collate, not as 0x0102 and tested as outputamount+version).

This builds on the previous BIP, which changes this, but I agree it needs to be explicit.

> b) please use hex for the selector values...

Done.

> c) you might consider two separate functions [op_tx and op_txcollate] instead of using a bit selector; makes it slightly easier to statically validate expected stack depth, though still not great.

Understandable, but since we have a bit selector anyway, I prefer this approach.

> d) not sure the ordering of collate-vs-noncollate makes sense - should "output" just always put the new data at the start of the buffer or the end of the stack so you consistently get the last output thing first?

Good catch!

I've switched the definition around to the order I think of it, i.e. txid order. But it means the definition of "output" for the non-collate case is a bit awkward: `"output" means "push a new element on the stack under the previous output elements" (i.e. the last output will be the top stack element).`

> e) input(num)witness seems like a bad idea to me, constraining how other inputs are spent just seems like asking for complexity and bugs.

I agree, so don't do that. However, it's possibly useful too, so it's allowed. Not like you're going to do this "accidentally".

> f) having a way to pull info from other inputs' annex (and your own) would be a win though, but maybe that should be an op_annex getter.

I was tempted to use separate bits for WITNESS_STACK, WITNESS_TAPSCRIPT and WITNESS_ANNEX for this reason, but the entire witness is more robust against future SegWit output types: I don't have to worry too hard about how to interpret those fields.

Note that getting the ANNEX is awkward, due to lack of DROPN. See the very-unfleshed-out OP_MULTI which would make this tenable.

> g) personally I don't think upgradability is a win here, better to upgrade by defining a new opcode than have weird semantics that people will forget to check.

Perhaps, but this argument seems very weak. Who will forget to check? The only possibility is someone who allows arbitrary selection_vectors in their script, and they're already kinda messed up.

> h) the ctv comparison at the end is inaccurate - ctv also commits to scriptSigs.

Oh, good catch! Fixed.

> i) automatically hashing it as you go rather than pushing the whole thing to the stack then hashing it seems worth considering

That seems like a regression to me. People have been proposing combo opcodes precisely because there hasn't been a decent costing model.

1. OP_SHA256 is a single opcode.

2. You can also now easily append/prepend/insert data that you want to commit to.

3. You have 4MB of stack, which is hard to exhaust (you can, using metadata, but it's difficult).

4. Hashing directly would reduce the cost from 12 varops per byte to 10, and you get 5200 varops (!) for every tx byte.

To reflect the depth of my gratitude for your feedback, I shall now review your lisp approach. Seems the least I can do! πŸ’œ

Forums or email seem better for long responses....

re: optxcollate as a separate opcode / auto hashing - I guess I'm thinking that [txcollate+sha256] could be added as a new opcode to existing tapscript without requiring varops budgeting to be applied to existing opcodes, so might be interesting to separate out and explore more immediately.

I think you could exhaust 4MB of stack with a tx spending 400 inputs each with a 10kB scriptPubKey ([1SUB DUP NOTIF p CHECKSIGVERIFY ENDIF]*250 or so for a 1 of 250 multisig maybe) and specifying input_scriptpubkey. Hashing as you go with collate would make that okay.
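The back-of-envelope arithmetic behind that exhaustion scenario (plain Python; the input count and scriptPubKey size are the rough figures given above):

```python
# Rough check of the exhaustion scenario above: a tx with 400 inputs,
# each bearing a ~10kB scriptPubKey. A script selecting
# input_scriptpubkey with collate pushes all of them, so the pushed
# data alone is on the order of the whole 4MB stack budget.
n_inputs = 400
spk_bytes = 10_000
total_pushed = n_inputs * spk_bytes  # 4,000,000 bytes, ~4MB
```

Hashing incrementally as the data is collated never materialises that 4MB on the stack, which is the point being made.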

re: output order, "push a new element under .. last output will be on top" seems contradictory to me. The last/most recent element will be under everything which is the bottom? First/oldest output should be the top and first to be popped afaics.
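To spell the ordering out, a toy model with a Python list as the stack (top at the end of the list). Pushing each new output *under* the previous output elements leaves the *first* output on top, which is the contradiction being pointed at:

```python
# Model the stack as a Python list, top of stack = end of list.
# "Push each new output under the previous output elements" means
# inserting below them; after processing outputs in txid order the
# FIRST output is on top, not the last.
stack = []
insert_at = None  # stack index where the output elements begin

for out in [b"out0", b"out1", b"out2"]:  # txid order
    if insert_at is None:
        insert_at = len(stack)  # remember where the run starts
    stack.insert(insert_at, out)  # slide under previous outputs

top = stack[-1]  # b"out0": the first output pops first
```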

You might consider moving the witness data to the end of the output and including a flag byte so that you can construct the wtxid correctly with only a single txcollate invocation.

I don't think scriptsig or witness in general is a good thing to introspect; it has too high a risk of making spending utxos together incompatible (already a problem with nlocktime). Pulling out any data requires knowing the script structure and execution path (and also requires DROPN afaics, particularly if the other script is using optx or checkmultisig or opmulti), which is hard to get right and easy to get wrong, as occurred with the ctv-bitvm idea recently.

The intent with the annex is to apply some simple structure to it as consensus, so that pulling data from it can be done easily and safely. [X I ANNEX] to get the entry tagged X from the annex for input I (I=-1 for current input perhaps) would be a thought. That would imply the annex should only include a single entry per tag (not clear to me if multiple entries for a tag is desirable or not). I'm leaning towards individual annex entries being limited to perhaps 127 bytes - if you want longer, [SHA256 X -1 ANNEX EQUALVERIFY] lets you put the long thing on the witness while still committing to it.
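The [X I ANNEX] getter sketched here could be modeled like this. Everything below is hypothetical: the one-entry-per-tag rule, the 127-byte entry limit, and the I=-1 convention come from the paragraph above, but the function names and encoding are made up; nothing here is a specified consensus format.

```python
# Hypothetical model of the [X I ANNEX] getter described above: each
# input's annex is a set of tagged entries, one entry per tag, each
# entry at most 127 bytes. Names and semantics are illustrative only.

MAX_ENTRY_LEN = 127

def parse_annex(entries):
    """Build a tag -> data map, enforcing one entry per tag and the
    per-entry size limit."""
    annex = {}
    for tag, data in entries:
        if tag in annex:
            raise ValueError("duplicate annex tag")
        if len(data) > MAX_ENTRY_LEN:
            raise ValueError("annex entry too long")
        annex[tag] = data
    return annex

def op_annex(tag, input_index, annexes, current_input):
    """[X I ANNEX]: fetch the entry tagged `tag` from input I's annex;
    I = -1 means the current input. Missing tags yield empty bytes."""
    if input_index == -1:
        input_index = current_input
    return annexes[input_index].get(tag, b"")

annexes = [parse_annex([(0x01, b"hello")]), parse_annex([])]
val = op_annex(0x01, -1, annexes, current_input=0)
```

The SHA256-commitment trick in the text then keeps any entry within the size cap: commit to `sha256(long_thing)` in the annex and supply the long thing on the witness.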

One reason to limit how much you can introspect other inputs is the costing - in a coinjoin eg you can know in advance what the pubkey etc is but the witness could be arbitrarily large and examining it could increase your script's costs more than you might expect. That has some risk of an O(n^2) blow out, which isn't a consensus risk due to the budgeting, but still seems undesirable if it forces your costs to go up in unexpected ways that you can't really know until you see how the other guy is going to authorise.

Replying to Rusty Russell

OK, hot off the keyboard! Here are my first pass series of 4 BIPS:

1. Varops Budget For Script Runtime Constraint

2. Restoration of disabled script functionality (Tapscript v2)

3. OP_TX

4. New Opcodes for Taproot v2

https://github.com/bitcoin/bips/compare/master...rustyrussell:bips:guilt/varops

I took this week off to work on these, so I can post them to the ML and we can start proper discussion. But I figured I might as well share early drafts here! Especially since I am sometimes so "in the weeds" that I forget what questions normal developers might have, looking at all these words...

Wish you'd at least take a look at the lisp approach...

Anyway, a few quick comments on the op_tx part:

a) if the selector is meant to be scriptnum compatible, would be good to make that clear (ie 513 on the stack is interpreted as outputscript+collate, not as 0x0102 and tested as outputamount+version).

b) please use hex for the selector values...

c) you might consider two separate functions [op_tx and op_txcollate] instead of using a bit selector; makes it slightly easier to statically validate expected stack depth, though still not great.

d) not sure the ordering of collate-vs-noncollate makes sense - should "output" just always put the new data at the start of the buffer or the end of the stack so you consistently get the last output thing first?

e) input(num)witness seems like a bad idea to me; constraining how other inputs are spent just seems like asking for complexity and bugs.

f) having a way to pull info from other inputs' annex (and your own) would be a win though, but maybe that should be an op_annex getter.

g) personally I don't think upgradability is a win here; better to upgrade by defining a new opcode than have weird semantics that people will forget to check.

h) the ctv comparison at the end is inaccurate - ctv also commits to scriptSigs.

i) automatically hashing it as you go rather than pushing the whole thing to the stack then hashing it seems worth considering

Hmm, maybe gmax is right about abbreviating as txn instead of tx, which is also short for transmit (he says after correcting an op_rx typo)

Require a zap or pow for comments from outside your social graph, and automatically add people to your social graph (and refund the zap) when you like a comment from them?

Bonus: will make moderating your comment threads feel like watching Adam West's Batman: Pow! Zap!

Scifi, and being a small target was the intended allusion.

For ancient fortifications, would it be referring more to a drawbridge or a portcullis?

The enemy's gate is down.

No I mean what real goal are you trying to achieve here that spam is making difficult - like presumably it's possible that we drop the pr entirely, but that doesn't actually do enough and there still ends up being "too much" spam. Or maybe the pr does go in and it turns out that even with it, there isn't a market for this nonsense and bitcoin doesn't end up with "very much" spam. What's the "business goal" that's affected by success or failure here, as far as you're concerned?

(For me the primary metric is "can a cheap node keep validating the blockchain", and to a lesser extent "is the long term tx fee rate level and its variability reasonable". I could probably put numbers on the first, though the latter is, at best, "i know it when i see it")

The first two of your metrics above have been met in the past month, I believe; is there some way in which bitcoin has been unusable for you in the past month?

The only (informed) argument I see for that position is something along the lines of "this acts as an endorsement that such data storage is fine, and will encourage people to do more of it. You can observe that this effect has already happened with a large increase in op_return activity since the pr was opened, and the creation of a new "op20" token scheme. That endorsement will flow over to schemes that use witness data, increasing their use as well. That will make it less likely that we'll have non-full blocks (due to more token usage of either type) and increase the data storage requirements of non-pruned nodes due to increased witness usage"

Personally, I think the immediate observable effect is due solely to the publicity the issue has generated and will disappear when attention moves elsewhere.

Replying to aj

"You can reduce spam by discouraging it" is a question of incentives, not one of fundamentals. I don't think we've finished the fundamental debate that you wanted yet.

Do you agree that core devs in general are trying to make bitcoin work as a monetary network with a similar/compatible definition to yours? That is, that we are actually on the same side here, even if we have different thoughts on how best to achieve that goal?

Do you accept that reasonable people can disagree on what constitutes "spam"? Both in thinking things you think aren't spam actually are, and in thinking stuff you think is spam is fine?

Do you accept that bitcoin is not your network, to rule as you wish, but that rule changes require near unanimous consent, even from the reasonable people who disagree with you, and potentially even from people doing the spamming?

Are you trying to ban spamming from bitcoin entirely?

Are you merely trying to discourage spammers? Are you aware that that might have no effect, or the opposite ("how dare you tell me what to do, I'll do it to spite you")?

Do you believe that going into technical details on this topic, to work out how best to make bitcoin sustainable as a monetary network in the long term is a valuable thing to do, or is that just a way for experts to attempt to close down the discussion and exclude non-experts?

I notice you're going straight back to personal attacks ("would you rather people question your competence rather than your intent") and justifying it with fairly specific thoughts about current approaches, rather than sticking with the fundamental principles you claimed to want to discuss.

Oh, a question I missed. What's your goal that makes you opposed to spam, rather than being willing to merely ignore it? Or what's the success condition that would let you happily move on from this issue?

Some might be:

* you're sure your node will keep working for 10 years on current reasonably priced hardware?

* the mempool is generally empty and blocks are rarely full?

* that you feel like you've taken a stand and sent a signal, even if 99% of blocks are 99% spam and fees are sky-high? ("they may have won the war, but i bloodied the Kaiser's nose, that's for sure" eg)

* it's cheaper to pay for a coffee on chain than it is to inscribe a jpeg?

* that the spam and transfer tx fee markets are decoupled somehow, so that any spikes in people wanting to spam the chain don't cause your fees to spike, even though more monetary txs will still cause your fees to spike? (Or have less effect than they do today, rather than none)

* nobody on the internet is ever wrong ever again?

* something else?

Replying to jimmysong

> I realise it's not pleasant to put something you hate (data spam) and something you like (other people using bitcoin as a MoE) into the same category, but as far as technical impact on your node's operation, they really are fundamentally the same, affecting your node in the same way, subject to the same global limits, impacting the same global marketplace in the same way (ie the blockspace fee market), having the same solution (move the substance off chain so you're mostly not subject to the limits or fees; eg via lightning).

I agree with everything until that last clause. You can reduce spam by discouraging it. That's what we've been doing with non-standard OP_RETURNs and non-standard scripts and below-dust-limit outputs, which you don't have to do with legitimate transactions. I actually really like your analysis, and the observation that what on-chain-footprint reduction has in common is some sort of layer two which settles at layer 1. That's a useful insight.

Where I think the reasoning goes wrong is that spammers don't have the same goal as you, to reduce their on-chain footprint and use off-chain protocols to make things efficient. The spammers specifically *want* to spam the chain and have been shown to be malicious in that regard, purposefully trolling Bitcoiners (like the stamps protocol, for example). So in a sense, thinking about making their protocols efficient is a waste of time. I really don't think they're interested because they could do the same stuff for way cheaper on almost every altcoin.

> Ceasing to forbid/discourage something isn't the same as encouraging it, and the continued implication that this is all about devs wanting to encourage spam continues to be unappreciated, unfair and incorrect.

That is the end effect, though, isn't it? Whatever your intentions may be, the end result is that more spam comes onto the network. Or maybe you'll deny this, in which case, we'll devolve into discussions about what might happen. But ultimately, I'm not inside your head and can't determine what your intentions are, I can only judge the practical effect this has. You said that you're prioritizing Bitcoin as a monetary network. The PR deprioritizes Bitcoin as a monetary network and your response seems to be that your intentions aren't to do that. OK, I guess? Would you rather have people question your competence instead of intent? Because the result is as clear as day that we'll see more spam and not less as a result of the PR. And when people like me say that, we get a lot of anger, insults and eventually, speculations about boogeymen that will do something worse if we don't agree to this PR.

"You can reduce spam by discouraging it" is a question of incentives, not one of fundamentals. I don't think we've finished the fundamental debate that you wanted yet.

Do you agree that core devs in general are trying to make bitcoin work as a monetary network with a similar/compatible definition to yours? That is, that we are actually on the same side here, even if we have different thoughts on how best to achieve that goal?

Do you accept that reasonable people can disagree on what constitutes "spam"? Both in thinking things you think aren't spam actually are, and in thinking stuff you think is spam is fine?

Do you accept that bitcoin is not your network, to rule as you wish, but that rule changes require near unanimous consent, even from the reasonable people who disagree with you, and potentially even from people doing the spamming?

Are you trying to ban spamming from bitcoin entirely?

Are you merely trying to discourage spammers? Are you aware that that might have no effect, or the opposite ("how dare you tell me what to do, I'll do it to spite you")?

Do you believe that going into technical details on this topic, to work out how best to make bitcoin sustainable as a monetary network in the long term is a valuable thing to do, or is that just a way for experts to attempt to close down the discussion and exclude non-experts?

I notice you're going straight back to personal attacks ("would you rather people question your competence rather than your intent") and justifying it with fairly specific thoughts about current approaches, rather than sticking with the fundamental principles you claimed to want to discuss.

What if what you think is a fundamental disagreement is just a lack of understanding? Would it be better to spend time talking and listening, rather than developing a long term schism or grudge?

Anyone have an android nostr client to recommend for a (roughly) twitter-like experience?