I agree it would be nice to prevent this somewhat formally, because I think there's some quite woolly thinking about "it's impossible to prevent data" without concrete analysis. Btw, even purecoin also suffers from amount fields being plaintext: though it's tough, a well-funded 'spammer' can probably get a number of bytes of data on chain with a "split and then recombine a large single utxo" strategy. It's an extremely low data embedding rate on a per-tx basis, but it's not nothing, even assuming we succeeded in getting rid of locktimes and of pubkey and sig embedding. If we encrypted amounts we might hit the old 'zk implies randomness implies embedding' problem again.
Discussion
'present' not 'prevent'
Cool. I think a first step is just to figure out the actual data embedding rate for a few different approaches/assumptions. Has anyone even figured this out for BIP444?
Good Q. But I fear the other side of the debate has already conceded the basic point, while arguing 'contiguous' somehow matters...
I think a Schnorr based purecoin has anything from 25 to 50%+ embedding rate depending on a bunch of stuff. BLS it'd be super small. Unless we missed a trick.
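For the "figure out the actual rate" step, a rough way to frame it: embedding rate is just attacker-chosen bytes divided by total on-chain bytes. Below is a minimal sketch for the amount-field case only. The function names and every parameter (free bits per amount, overhead, bytes per output) are hypothetical placeholders, not measured values for purecoin or any real proposal.

```python
def embedding_rate(embedded_bits: int, tx_bytes: int) -> float:
    """Fraction of a transaction's bytes that carry attacker-chosen data."""
    if tx_bytes <= 0:
        raise ValueError("tx_bytes must be positive")
    return (embedded_bits / 8) / tx_bytes


def amount_embedding_rate(
    n_outputs: int,
    free_bits_per_amount: int,  # assumption: low-order amount bits the spammer can choose freely
    overhead_bytes: int,        # assumption: fixed per-tx bytes (inputs, sigs, headers)
    bytes_per_output: int,      # assumption: serialized size of one output
) -> float:
    """Rate for a 'split one utxo into n outputs' tx that hides data in amounts."""
    tx_bytes = overhead_bytes + n_outputs * bytes_per_output
    return embedding_rate(n_outputs * free_bits_per_amount, tx_bytes)


# Placeholder numbers purely for illustration.
rate = amount_embedding_rate(
    n_outputs=10, free_bits_per_amount=16, overhead_bytes=100, bytes_per_output=43
)
print(f"{rate:.2%}")
```

Swapping in real serialization sizes, and a defensible estimate of how many amount bits are actually free, would be needed before comparing this against the sig/pubkey figures.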