Nostr Web Client

AFAIK these are “chopped up” into segments or no?

No, check out that transaction. The limits for script witness are currently 4MBs by consensus and 400kBs by standardness.

(Please correct me if I'm wrong, but this is my read of it)

https://bitcoin.stackexchange.com/questions/117594/what-are-bitcoins-transaction-and-script-limits

Discussion

nout 3mo ago 💬 2

And if I'm correct, then the whole Knots CSAM argument is just fear mongering, right?

Ferris Bueller 3mo ago

I didn't see anything in your link that told me what I needed in order to see witness data and I've never tried to do it so I used an LLM and this is what it told me I needed to extract the data:

-bitcoin-cli (to dump the raw transaction)

-jq or another parser to extract witness fields

-xxd or Python/Go/Rust to reassemble hex into binary

-Knowledge of file formats / scripting skills

Here is for OP_Return:

- copy / paste hex to binary with one command

One seems harder to get as a normie...

Also, running a Knots node, I am not forced to send data I don't want to participate in. If it ends up on the chain anyway, at least I wasn't complicit in propagating it.
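For reference, the witness-extraction steps listed in this post can be sketched in a few lines of Python. This is only a rough sketch under assumptions: it expects the decoded JSON that `bitcoin-cli getrawtransaction <txid> true` prints (witness items appear under `vin[].txinwitness`), and the `sample` transaction below is made up purely for illustration.

```python
import json

def witness_items(decoded_tx: dict) -> list[bytes]:
    """Collect every witness stack item of a decoded transaction as raw bytes."""
    items = []
    for vin in decoded_tx.get("vin", []):
        for hex_item in vin.get("txinwitness", []):
            items.append(bytes.fromhex(hex_item))
    return items

# Hypothetical stand-in for the JSON printed by bitcoin-cli (not a real transaction).
sample = json.loads('{"vin": [{"txinwitness": ["aa", "0063036f7264", "c0"]}]}')

items = witness_items(sample)
# For an inscription, the tapscript carrying the data is usually the largest item.
largest = max(items, key=len)
```

The result is still a script, not a file; the embedded bytes have to be pulled out of it in a second step.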

nout 3mo ago 💬 1

The effort is almost the same. For reading OP_RETURN using bitcoin-cli the script would look very similar - i.e. it reads the bytes following "OP_RETURN" (i.e. "6a" in hex); for ordinals/inscriptions the script reads the bytes that follow "ord" (i.e. "6f7264" in hex) and skips the "4d" bytes every 520 bytes...
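To make the comparison concrete, here is a minimal Python sketch of both readers. It is an illustration under assumptions, not a complete parser: the function names are made up, the inscription envelope is assumed to start with OP_FALSE OP_IF followed by a push of "ord", and the body is assumed to be every push between the OP_0 separator and OP_ENDIF.

```python
def read_pushes(script: bytes, pos: int = 0):
    """Yield (opcode, data) pairs; data is None for opcodes that push nothing."""
    while pos < len(script):
        op = script[pos]
        pos += 1
        if 1 <= op <= 0x4b:        # direct push of `op` bytes
            size = op
        elif op == 0x4c:           # OP_PUSHDATA1: 1-byte length
            size = script[pos]
            pos += 1
        elif op == 0x4d:           # OP_PUSHDATA2: 2-byte little-endian length
            size = int.from_bytes(script[pos:pos + 2], "little")
            pos += 2
        elif op == 0x4e:           # OP_PUSHDATA4: 4-byte little-endian length
            size = int.from_bytes(script[pos:pos + 4], "little")
            pos += 4
        else:                      # OP_0, OP_IF, OP_ENDIF, OP_CHECKSIG, ...
            yield op, None
            continue
        yield op, script[pos:pos + size]
        pos += size

def op_return_payload(script: bytes):
    """Join the pushes that follow OP_RETURN (0x6a) in an output script."""
    if not script or script[0] != 0x6a:
        return None
    return b"".join(d for _, d in read_pushes(script, 1) if d is not None)

def inscription_body(script: bytes):
    """Join the body pushes of an 'ord' envelope (simplified: first match only)."""
    marker = bytes.fromhex("0063036f7264")  # OP_FALSE OP_IF <push 3 bytes: "ord">
    start = script.find(marker)
    if start < 0:
        return None
    body, in_body = [], False
    for op, data in read_pushes(script, start + len(marker)):
        if op == 0x68:             # OP_ENDIF closes the envelope
            break
        if op == 0x00:             # OP_0 separates the header fields from the body
            in_body = True
        elif in_body and data is not None:
            body.append(data)
    return b"".join(body)
```

Both readers share the same push decoder; the inscription reader only adds the envelope handling, so the length prefixes between chunks are dropped as a side effect of decoding.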

Of course people will create viewers like mempool space that make this much easier. Same as they did for the data in witness/inscriptions: https://ordiscan.com/tx/eadfab095bfe5bcfde6024126ef139ac0e98f19dcb4c8fc4e96bfa87cc39141e

Ferris Bueller 3mo ago 💬 1

The first paragraph doesn't seem very compelling. I'm relatively technically competent and that doesn't seem straightforward to me, but maybe we can agree to disagree there.

The second paragraph, however, is the most compelling thing I have seen yet. Thank you for sharing that. But this is still a website on the clearnet that obviously can't host CSAM. So a normal person running a node couldn't use that website to see CSAM if it was there. It would then only be easily seen with OP_Return on their node.

Am I missing something or do you disagree?

nout 3mo ago 💬 1

For the first paragraph - The main argument here is for bitcoin node runners - and that assumes that you are running a node. And I think most node runners are comfortable with running a bash one-liner.

For the second - When you are running a node you usually also run local explorers connected to your node (e.g. mempool space is built into most node packages, like Umbrel). In the same way, there are ways to run an ordinals explorer locally.

The website that I showed is using its own local bitcoin node to read from, and if there is CSAM in inscriptions (nothing is preventing that), then that node also has CSAM and there is a chance that the clearnet website would show it, unless they implemented some strong system to analyze and filter CSAM before it gets displayed.

It's important to note that CSAM is utterly disgusting, but nothing in this discussion seems to make any difference to (or even be related to) actually preventing child abuse.

Ferris Bueller 3mo ago

I pulled up my mempool instance in start9 and I'm looking at block 913983. I'm looking at the witness data and I do not see 6a or 6f7264 anywhere. The fact that you needed to tell me what to look for I feel like is proving my point haha. The witness data is crazy long and would overwhelm most people. Isn't that because it's split up every 520 bytes like you are saying?

I run a start9 so I don't see ordiscan on their registry or community one. I didn't even know there was another one besides mempool.space.

And I mostly agree that this is not stopping CSAM or child abuse for that matter. I just don't see why Core decided node runners should be required to relay it and store it on their own computers. Why do they want to give another place to put spam, one that I would argue is easier to see for a non-tech person? I don't want to facilitate spam because I just want it to be money. I understand the UTXO bloat argument, but if the spam is in op_return it would cost more anyway, so idk why they would use it unless to make a statement with something explicit or to attack the network.

Do you give any validity to the idea that if somebody put malware in the OP_Return it would trigger antivirus on nodes around the network? Especially cloud ones, for example? I understand it wouldn't be executable but I'm hearing it could still set them off. Thoughts?

Toxic Bitcoiner 3mo ago 💬 1

So then is it that the level of effort required to contort that back into a picture is high enough that it’s not a legal problem for node runners? The effort required is (significantly) lower if it’s in opreturn? Vaguely heard about this.

nout 3mo ago 💬 4

In both cases it's a bash script one-liner to extract the data from the chain. Not really practically different.

Again, someone please correct me if I'm wrong, but this really makes the whole Knots CSAM argument just plain stupid scare tactics...

Toxic Bitcoiner 3mo ago

I think two other commenters here corrected this

nout 3mo ago

Yeah, from the other comment I learned that there is a sequence "4d" inserted every 520 bytes, but the ways to retrieve this data are the same for almost all intents and purposes.

So is the claim that there is a law that specifically defines that an image has to be stored in contiguous bytes (without any extra bytes inserted) to be considered CSAM?

nout 3mo ago

(correction, the inserted sequence is "4d0802" in most cases, because it includes the length: 520 is 0x0208, which is serialized little-endian as "0802")

JackTheMimic 3mo ago 💬 2

This has nothing to do with laws. It is a malware detection issue because of the contiguous state of the data.

nout 3mo ago

The one topic that has been heavily discussed above was CSAM - there it's about laws.

About malware - contiguous or not doesn't really have an effect here, right? Malware is detected via fingerprints (short byte sequences) that are common to the malware, via heuristic analysis, or via some other tricks (like running the code in a sandbox). So malware stored in inscriptions or in op_returns would be detected the same way, with probably the same results. But taking a step back - this isn't even a real concern, since node-to-node packets are now encrypted, right? (since BIP 324 is enabled by default in Core)

So what is exactly the argument about malware in this case?
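A toy Python experiment can show what is at stake in the "contiguous or not" question. It is only a sketch under loud assumptions: the scanner is a plain substring match (real antivirus engines are far more involved), the fingerprint `SIG` is made up, and the data is chunked with a 3-byte OP_PUSHDATA2 header before every 520 bytes.

```python
LIMIT = 520
HEADER = bytes.fromhex("4d0802")  # OP_PUSHDATA2 + little-endian length 520
SIG = b"FINGERPRINT"              # made-up 11-byte fingerprint

def with_push_headers(payload: bytes) -> bytes:
    """Insert a push header before every 520-byte chunk (payload is a multiple of 520)."""
    return b"".join(HEADER + payload[i:i + LIMIT] for i in range(0, len(payload), LIMIT))

def naive_scan(blob: bytes) -> bool:
    """Toy detector: does the fingerprint appear as one unbroken byte run?"""
    return SIG in blob

inside = bytearray(2 * LIMIT)
inside[100:100 + len(SIG)] = SIG      # fingerprint sits fully inside the first chunk

straddle = bytearray(2 * LIMIT)
straddle[515:515 + len(SIG)] = SIG    # fingerprint crosses the 520-byte boundary

results = {
    "inside, contiguous": naive_scan(bytes(inside)),
    "inside, chunked": naive_scan(with_push_headers(bytes(inside))),
    "straddle, contiguous": naive_scan(bytes(straddle)),
    "straddle, chunked": naive_scan(with_push_headers(bytes(straddle))),
}
```

In this toy setup only the last case comes back False: a short fingerprint survives chunking unless it happens to cross a chunk boundary, where the inserted header splits it.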

JackTheMimic 3mo ago

The argument is validation. When validating, those bytes are processed in the clear, decrypted. (XOR doesn't change this.) Also, the fingerprints for malware detection are more like heuristics. Does this have fingerprint 1? Yes, check for 2. And so on. (This is obviously so the antivirus ALSO isn't running those same contiguous bytes through its processor.)

As far as CSAM goes, I am fairly certain (though I don't dare try it) that if you run those bytes through something as basic as VLC you will get an image. No additional decoding necessary. Meaning you are storing a raw data file of terrible shit in RAM that you could conceivably render with simple image software, NOT some inscription decoding taproot wizard shit.

The malware is an attack vector, the CSAM is more or less gross despicable shit. All enabled by 100 kB OP_RETURNs easily propagated through the gossip network.

JackTheMimic 3mo ago

Yeah, see how it uses "OP_PUSHDATA2" over and over again? That makes the data non-contiguous, because there is a byte limit when you push data on the stack. (Or else the script doesn't know where your transaction ends.) This is not 100 kB of CONTIGUOUS data. Meaning it IS chunked up.
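The repeated OP_PUSHDATA2 pattern can be reproduced with a short Python sketch. It assumes a 520-byte limit per push and picks the smallest push opcode that fits each chunk; `chunk_pushes` is an illustrative name, not a function from any library.

```python
def chunk_pushes(payload: bytes, limit: int = 520) -> bytes:
    """Encode payload as consecutive script pushes of at most `limit` bytes each."""
    out = bytearray()
    for i in range(0, len(payload), limit):
        chunk = payload[i:i + limit]
        n = len(chunk)
        if n <= 0x4b:
            out.append(n)                              # direct push
        elif n <= 0xff:
            out += bytes([0x4c, n])                    # OP_PUSHDATA1 + 1-byte length
        else:
            out += b"\x4d" + n.to_bytes(2, "little")   # OP_PUSHDATA2 + 2-byte length
        out += chunk
    return bytes(out)

encoded = chunk_pushes(bytes(1200))   # 1200 zero bytes -> chunks of 520, 520, 160
print(encoded[:3].hex())              # 4d0802
```

Every full chunk is preceded by the same three bytes, which is the recurring pattern visible in the witness hex; the shorter final chunk gets a different, shorter prefix.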

nout 3mo ago

Ok, so the main difference is that in witness encoded data you have "4d" inserted every 520 bytes, but it only costs 1/4 of fees?
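The "1/4 of fees" figure comes from how transaction weight is counted: each non-witness byte counts as 4 weight units, each witness byte as 1, and fees are paid per virtual byte (weight divided by 4). A rough Python sketch of that arithmetic follows; it looks at the payload bytes only and ignores push headers and the rest of the transaction.

```python
import math

def vbytes(non_witness_bytes: int, witness_bytes: int) -> int:
    """Virtual size: non-witness bytes weigh 4 units each, witness bytes 1 unit each."""
    weight = 4 * non_witness_bytes + witness_bytes
    return math.ceil(weight / 4)

payload = 100_000
in_op_return = vbytes(payload, 0)   # payload sits in an output script
in_witness = vbytes(0, payload)     # payload sits in the witness

print(in_op_return, in_witness)     # 100000 25000
```

At the same fee rate per virtual byte, the same payload costs roughly four times as much in an OP_RETURN output as in the witness.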

JackTheMimic 3mo ago

Well, not exactly. Contiguous bytes that amount to CSAM, or malware in general, would probably cause people who run a VPS node to have their VM shut down. That's probably 20% of the node network, including major players like exchanges and mining orgs. THAT is the attack vector. The mitigation would then effectively be having VPS providers whitelist malware (which they won't). Having the pushbytes allows the data to be segmented, so it won't trip malware detection. Nothing to do with the fees.