Bitcoin's purpose is NOT to preserve arbitrary data. Bitcoin is a game that brings special bit sequences into existence, and we use these bit sequences to emulate physical ownership via secret knowledge.
Discussion
Information never is and can't ever be scarce. Valid moves in the game are scarce, as the playing field is limited in time and space. Information can never be owned. It can only be kept secret.
The data preserved by Bitcoin is NOT arbitrary. It's deeply meaningful in the context of the game, which is bound by non-arbitrary physics.
Adding arbitrary data will always suffer the Oracle problem, which is to say "bananas on the blockchain." It doesn't solve anything.
Determining what bit sequence is valid via physical constraints is what makes bitcoin real, not a simulation. It's what makes it ELECTRONIC cash, not merely a digital construct.
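To make "valid via physical constraints" concrete, here is a minimal sketch of the proof-of-work test, assuming the usual double SHA-256 over the header bytes and the compact "bits" target encoding. The `mine` helper and the toy header prefix are hypothetical and only for illustration; real headers are 80 bytes with a fixed field layout.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as used for block header hashing."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' encoding into a 256-bit integer target."""
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))

def meets_target(header: bytes, target: int) -> bool:
    """A bit sequence is a valid move only if its hash, read as a
    little-endian integer, does not exceed the target."""
    return int.from_bytes(double_sha256(header), "little") <= target

def mine(prefix: bytes, target: int, max_tries: int = 1_000_000):
    """Hypothetical toy miner: append a 4-byte nonce until the hash fits."""
    for nonce in range(max_tries):
        if meets_target(prefix + nonce.to_bytes(4, "little"), target):
            return nonce
    return None

if __name__ == "__main__":
    easy_target = 1 << 248  # toy difficulty: roughly 1 in 256 hashes qualifies
    print("nonce found:", mine(b"toy header", easy_target))
    print("target for bits 0x1d00ffff:", hex(bits_to_target(0x1D00FFFF)))
```

Nothing about the winning bytes is special except that energy had to be spent to find them; lowering the target makes the search proportionally more expensive.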
We must preserve the FREEDOM of being able to use Bitcoin as a currency.
All other uses should not interfere with the currency function.
Why would the Bitcoin protocol allow anything besides strict transaction data to be written in a block? Seems absurd. What am I missing?
It is an oversight, a bug that has been exploited, nothing more.
OP_RETURN does exactly this, intentionally. Current methods are just cheaper in terms of fees.
OP_RETURN has a limit of 80 bytes; this avoids the possibility of spamming the timechain with monkeys.
Storing huge data in an OP_IF block is not intentional at all. It is cheaper because it is effectively an exploit.
OP_RETURNs of arbitrary size are consensus valid, simply nonstandard (won't get relayed by normal nodes). You could also easily design a scheme where a monkey jpeg is split across multiple transactions, and it would be valid AND standard, but again it would be pretty inefficient.
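A rough sketch of the consensus-versus-standardness distinction, assuming the long-standing Bitcoin Core relay default of 83 script bytes for OP_RETURN outputs (80 bytes of data plus three bytes of opcodes). That constant is policy, not consensus, and defaults can differ between node versions; the function names here are hypothetical.

```python
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C
OP_PUSHDATA2 = 0x4D

# Assumption: relay policy default, not a consensus rule.
ASSUMED_MAX_SCRIPT_BYTES = 83

def push(data: bytes) -> bytes:
    """Encode a single data push using the smallest push opcode that fits."""
    n = len(data)
    if n <= 75:
        return bytes([n]) + data
    if n <= 0xFF:
        return bytes([OP_PUSHDATA1, n]) + data
    if n <= 0xFFFF:
        return bytes([OP_PUSHDATA2]) + n.to_bytes(2, "little") + data
    raise ValueError("payload too large for this sketch")

def op_return_script(data: bytes) -> bytes:
    """Build an OP_RETURN output script carrying one data push."""
    return bytes([OP_RETURN]) + push(data)

def is_standard_size(script: bytes) -> bool:
    """Would a node using the assumed default relay this script?"""
    return len(script) <= ASSUMED_MAX_SCRIPT_BYTES

def chunk(payload: bytes, size: int = 80) -> list:
    """Split a payload into pieces that each fit one standard OP_RETURN."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]

if __name__ == "__main__":
    jpeg = b"\xff\xd8" + b"\x00" * 198  # stand-in bytes, not a real image
    scripts = [op_return_script(c) for c in chunk(jpeg)]
    print(len(scripts), "outputs, all standard:",
          all(is_standard_size(s) for s in scripts))
    print("one 200-byte output standard:",
          is_standard_size(op_return_script(jpeg)))
```

The chunked version passes the relay check but pays per-transaction overhead for every 80 bytes, which is the inefficiency mentioned above.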
How does this inform the current debates? Do we need a protocol change, or do we let market forces and game theory work it out?
Arbitrary data can be handled by pBFT replication algorithms, which don't have to be secured by tokens (or can be bonded to bitcoin). I personally think it's kinda sad that Nakamoto Consensus gives us 50.0001% defense against tampering and people think that going back to pBFT's 2/3 majority is a step forward.
Also, arbitrary data needs to be segmented by type or size or geographical or conceptual proximity to be cost effective. The whole reason why Bitcoin is a global database is because it's for a Global Currency. Data is not the same. Data has a definite set of interested users and thus such replication is pointlessly inefficient.
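For the thresholds mentioned in this comment, a small sketch using the textbook pBFT bound (n replicas tolerate f Byzantine faults only when n >= 3f + 1) next to the simple-majority assumption behind Nakamoto Consensus. The function names are hypothetical, and the quorum formula is the general intersection bound, not taken from any particular implementation.

```python
def pbft_max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    if n < 1:
        raise ValueError("need at least one replica")
    return (n - 1) // 3

def pbft_quorum(n: int) -> int:
    """Smallest quorum size such that any two quorums share at least
    one honest replica: ceil((n + f + 1) / 2)."""
    f = pbft_max_faulty(n)
    return (n + f) // 2 + 1

def nakamoto_honest_fraction_needed() -> float:
    """Honest parties need strictly more than half of the hash power."""
    return 0.5

if __name__ == "__main__":
    for n in (4, 7, 100):
        print(n, "replicas: tolerates", pbft_max_faulty(n),
              "faulty, quorum of", pbft_quorum(n))
```

With 100 replicas this tolerates 33 faulty ones and needs a quorum of 67, which is the "2/3 majority" being compared with the bare majority above.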
Secret Knowledge sounds so much cooler than Password.
I still think OpenTimestamps is valuable and at such a low block space cost it's hard to complain.
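Why the block space cost can stay so low: many document hashes can be folded into one 32-byte digest, and only that digest needs to land on chain. Below is a minimal Merkle-root sketch of the idea, assuming plain SHA-256 and duplicate-last-node padding. It is not the actual OpenTimestamps proof format, and the function names are hypothetical.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(documents: list) -> bytes:
    """Fold any number of documents into a single 32-byte commitment."""
    if not documents:
        raise ValueError("need at least one document")
    level = [sha256(doc) for doc in documents]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels by repeating the last node
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

if __name__ == "__main__":
    docs = [f"document {i}".encode() for i in range(1000)]
    root = merkle_root(docs)
    print(len(docs), "documents ->", len(root), "bytes:", root.hex())
```

A thousand documents or a million produce the same 32 bytes of on-chain footprint; each holder keeps their own path to the root off chain.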
Game of Bits.