in case anyone doesn't understand how taproot enabled segwit witness data to be polluted, here is an explanation written by github copilot:

The Taproot upgrade for Bitcoin, activated in November 2021, introduced several changes that indirectly allowed large blobs of arbitrary data to be stored in SegWit witness data. Here’s a breakdown of how this happened:

Background

SegWit (2017):

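Segregated Witness (SegWit) moved witness data (signatures and other script inputs) into a separate part of the transaction that is committed to through the coinbase transaction and does not count toward the legacy 1 MB base block size. It replaced the block size limit with a 4,000,000 weight unit limit, under which witness bytes count for one weight unit each and all other bytes count for four, so blocks can carry more data when much of it is witness data.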

Taproot (2021):

Taproot is a soft fork upgrade that enhanced Bitcoin’s scripting and privacy capabilities. It introduced a new output type (P2TR, Pay-to-Taproot) which uses Schnorr signatures and allows for more flexible scripting via MAST (Merkelized Alternative Script Trees).

Key Changes From Taproot

1. Introduction of Tapscript

Taproot introduced a new version of Bitcoin Script called Tapscript (BIP 342), which relaxes certain limits compared to previous script types. Notably, Tapscript drops the 10,000-byte script size limit and the limit of 201 non-push opcodes per script, so a tapscript's size is bounded only by the block weight limit. The 520-byte limit on each individual push is unchanged, but a single script can now contain as many such pushes as fit in a block.

Pre-Taproot: A witness script could be at most 10,000 bytes in total, with each push at most 520 bytes.

Post-Taproot: A tapscript can carry an arbitrary amount of data as a series of pushes of up to 520 bytes each (up to the overall limits of witness data).

2. No Signature Requirement for OP_RETURN-like Data

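With Taproot’s changes, users can create "script-path spends" whose revealed script carries data that plays no part in validation, typically inside a branch that is never executed (OP_FALSE OP_IF ... OP_ENDIF). The data itself needs no signature or specific public key; the script only has to finish with a true value on the stack.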

3. Witness Data Is Still Discounted

Because SegWit discounts the weight of witness data, storing data in the witness is much cheaper (per byte) than in the rest of the transaction.
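To put numbers on the discount, here is a minimal sketch of the BIP 141 weight accounting. The function names and the 100,000-byte payload are illustrative assumptions, not taken from any particular library.

```python
# BIP 141 weight accounting: each non-witness byte costs 4 weight units,
# each witness byte costs 1. Fees are usually quoted per virtual byte (weight / 4).
WITNESS_SCALE_FACTOR = 4
MAX_BLOCK_WEIGHT = 4_000_000


def weight_units(non_witness_bytes: int, witness_bytes: int) -> int:
    """Total weight of a transaction with the given byte counts."""
    return non_witness_bytes * WITNESS_SCALE_FACTOR + witness_bytes


def virtual_bytes(non_witness_bytes: int, witness_bytes: int) -> int:
    """Virtual size in vbytes, rounded up."""
    weight = weight_units(non_witness_bytes, witness_bytes)
    return -(-weight // WITNESS_SCALE_FACTOR)


payload = 100_000  # hypothetical blob size in bytes
print("stored in witness:    ", virtual_bytes(0, payload), "vbytes")
print("stored outside witness:", virtual_bytes(payload, 0), "vbytes")
```

Under these rules the same payload occupies a quarter of the virtual size when it sits in the witness, which is the 4x price difference the paragraph above refers to.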

4. Rise of Ordinals and Inscriptions

Developers discovered that Taproot’s relaxed rules allow for the storage of large files (images, text, etc.) in Taproot witness data, which led to the Ordinals protocol and its inscriptions. These leverage Taproot's script-path spends to encode arbitrary data as part of a valid spend.
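As a rough illustration of how such data can ride along in a script, the sketch below builds an unexecuted OP_FALSE OP_IF ... OP_ENDIF envelope and splits the payload into pushes of at most 520 bytes (the per-element limit that Tapscript still enforces). The helper names `push` and `data_envelope` are hypothetical, and real inscription formats add protocol tags, metadata and a signature check that are left out here.

```python
OP_FALSE = 0x00
OP_IF = 0x63
OP_ENDIF = 0x68
OP_PUSHDATA1 = 0x4C
OP_PUSHDATA2 = 0x4D
MAX_ELEMENT_SIZE = 520  # consensus limit on a single pushed element


def push(data: bytes) -> bytes:
    """Encode one data push with the shortest length prefix (simplified)."""
    n = len(data)
    if n > MAX_ELEMENT_SIZE:
        raise ValueError("push exceeds 520 bytes")
    if n == 0:
        return bytes([OP_FALSE])  # an empty push is encoded as OP_0
    if n <= 75:
        return bytes([n]) + data  # opcodes 0x01..0x4b push n bytes directly
    if n <= 255:
        return bytes([OP_PUSHDATA1, n]) + data
    return bytes([OP_PUSHDATA2]) + n.to_bytes(2, "little") + data


def data_envelope(payload: bytes) -> bytes:
    """Wrap payload in a branch that is never executed."""
    chunks = [payload[i:i + MAX_ELEMENT_SIZE]
              for i in range(0, len(payload), MAX_ELEMENT_SIZE)]
    body = b"".join(push(chunk) for chunk in chunks)
    return bytes([OP_FALSE, OP_IF]) + body + bytes([OP_ENDIF])


script_fragment = data_envelope(bytes(1300))  # 1300 zero bytes as a stand-in file
print(len(script_fragment), "bytes of script for a 1300-byte payload")
```

Because the branch is skipped during execution, the pushed chunks never touch the stack; they only need to be well-formed pushes.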

Summary

Taproot enabled large blobs of arbitrary data in SegWit witness data by:

Introducing Tapscript, which removes the 10,000-byte script size limit, so a single script can hold any number of pushes of up to 520 bytes each.

Allowing script-path spends that can push arbitrary data and succeed with minimal requirements.

Keeping witness data discounted (cheaper to store).

Result:

People can now inscribe large files (up to the block weight limits) into Bitcoin transactions by embedding them in Taproot witness data.

here's its take on what would resolve the issue:

To address the problem of users pushing large, cheap blobs of arbitrary data into the witness data section of Bitcoin transactions, a protocol rule change could focus on limiting the size and/or the type of data allowed in the witness, or increasing its economic cost. Here are possible solutions, with pros and cons:

1. Tighten Per-Push Size Limits in Tapscript

Change:

Enforce a tighter byte limit on each individual data push in Tapscript. Consensus already caps each push at 520 bytes, so in practice this means a lower cap or a limit on total script size, such as the pre-Taproot 10,000 bytes.

Effect:

Limits how much data a single script can carry, making it harder to inscribe large files in a single input.

Pros:

Simple, familiar limit.

Stops most large data blobs.

Cons:

Can be circumvented by splitting data into many smaller pushes.

May break legitimate advanced scripts relying on larger pushes.
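How cheap is the splitting workaround? A back-of-the-envelope sketch, assuming standard push encodings and a hypothetical 100,000-byte payload, counts the extra length-prefix bytes needed for a given per-push cap. The function names are illustrative.

```python
def push_overhead(n: int) -> int:
    """Bytes of length prefix needed to push n bytes of data."""
    if n <= 75:
        return 1  # direct push opcode
    if n <= 255:
        return 2  # OP_PUSHDATA1 + 1-byte length
    if n <= 65_535:
        return 3  # OP_PUSHDATA2 + 2-byte length
    return 5      # OP_PUSHDATA4 + 4-byte length


def split_cost(payload_len: int, per_push_limit: int) -> tuple[int, int]:
    """Return (number of pushes, total prefix overhead in bytes)."""
    full, rest = divmod(payload_len, per_push_limit)
    pushes = full + (1 if rest else 0)
    overhead = full * push_overhead(per_push_limit)
    if rest:
        overhead += push_overhead(rest)
    return pushes, overhead


for limit in (520, 80):
    pushes, overhead = split_cost(100_000, limit)
    print(f"limit {limit}: {pushes} pushes, {overhead} overhead bytes")
```

With a 520-byte cap the overhead is well under 1% of the payload, and even an 80-byte cap adds only about 2.5%, which is why a per-push limit alone does little to deter large payloads.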

2. Reduce or Remove Witness Discount for Non-Signature Data

Change:

Alter the "witness discount" so that non-signature data (i.e., arbitrary data not used for script validation) is charged at full block weight, or even at a premium.

Effect:

Makes inscribing large blobs much more expensive economically, discouraging abuse.

Pros:

Targets the economic incentive directly.

Preserves flexibility for legitimate witness uses.

Cons:

Requires distinguishing between signature/script data and arbitrary data, which may be challenging in practice.

More complex to implement and enforce.
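To see the economic effect, the sketch below compares the fee for the same transaction with the current discount and with no discount. The byte counts, the fee rate and the `witness_weight` parameter are hypothetical; the sketch prices all witness bytes alike and does not attempt the hard part named above, telling signature data from arbitrary data.

```python
def fee_sats(non_witness_bytes: int, witness_bytes: int,
             feerate_sat_per_vb: int, witness_weight: int = 1) -> int:
    """Fee in satoshis when each witness byte costs `witness_weight` weight units.

    witness_weight=1 is today's rule; witness_weight=4 removes the discount.
    """
    weight = non_witness_bytes * 4 + witness_bytes * witness_weight
    return weight * feerate_sat_per_vb // 4  # 4 weight units per vbyte


# Hypothetical transaction: 200 non-witness bytes, 100,000 witness bytes, 10 sat/vB.
discounted = fee_sats(200, 100_000, 10, witness_weight=1)
undiscounted = fee_sats(200, 100_000, 10, witness_weight=4)
print(discounted, undiscounted, round(undiscounted / discounted, 2))
```

For a witness-heavy transaction like this one the fee rises almost fourfold, while a typical payment with a small witness would see a much smaller increase.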

3. Explicit Protocol-Level Data Push Limits per Input

Change:

Set a maximum total witness size per input (e.g., 10 KB), not just per push.

Effect:

Directly caps the amount of data that can be inscribed in any given input.

Pros:

Simple enforcement.

No need to parse content type.

Cons:

May restrict legitimate use cases needing large scripts or multisigs.

May push abusers to use more inputs per transaction.
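The last point can be estimated roughly. This sketch assumes a 10,000-byte per-input cap and counts only the fixed 41 non-witness bytes each extra input adds (outpoint, empty scriptSig length, sequence); it ignores per-input witness overhead and the cost of creating the outputs being spent. All names and figures are illustrative.

```python
INPUT_OVERHEAD_BYTES = 41  # 36-byte outpoint + 1-byte scriptSig length + 4-byte sequence


def inputs_needed(payload_bytes: int, per_input_cap: int) -> int:
    """Inputs required to carry the payload under a per-input witness cap."""
    return -(-payload_bytes // per_input_cap)  # ceiling division


def extra_weight(payload_bytes: int, per_input_cap: int) -> int:
    """Weight units added by the additional inputs (non-witness bytes cost 4 each)."""
    extra_inputs = inputs_needed(payload_bytes, per_input_cap) - 1
    return extra_inputs * INPUT_OVERHEAD_BYTES * 4


print(inputs_needed(100_000, 10_000), "inputs,",
      extra_weight(100_000, 10_000), "extra weight units")
```

Under these assumptions a 100,000-byte payload needs ten inputs and gains only about 1.5% in weight, so a cap by itself raises the cost of large payloads modestly.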

4. Disallow Script-Path Spends That Only Push Data

Change:

Add a rule that script-path spends must perform some cryptographic or script validation (e.g., require the use of OP_CHECKSIG or OP_CHECKSIGADD).

Effect:

Prevents witness data from being used purely as a data dump, forcing spends to be "meaningful."

Pros:

Targets the abuse pattern directly.

Still allows legitimate Taproot scripts.

Cons:

May break certain advanced scripting techniques.

Could be hard to define "meaningful" scripts.
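A naive version of such a rule can be sketched as a scan for signature-checking opcodes that skips over pushed data. This is an illustrative check with hypothetical function names, not a proposed consensus rule.

```python
SIG_OPCODES = {0xAC, 0xAD, 0xBA}  # OP_CHECKSIG, OP_CHECKSIGVERIFY, OP_CHECKSIGADD
PUSHDATA_WIDTH = {0x4C: 1, 0x4D: 2, 0x4E: 4}  # OP_PUSHDATA1/2/4 length-field sizes


def iter_opcodes(script: bytes):
    """Yield each opcode in the script, skipping the bytes that pushes carry."""
    i = 0
    while i < len(script):
        op = script[i]
        i += 1
        if op <= 0x4B:
            size = op  # direct push of `op` bytes (0x00 pushes nothing)
        elif op in PUSHDATA_WIDTH:
            width = PUSHDATA_WIDTH[op]
            if i + width > len(script):
                raise ValueError("truncated push length")
            size = int.from_bytes(script[i:i + width], "little")
            i += width
        else:
            size = 0  # ordinary opcode, no inline data
        if i + size > len(script):
            raise ValueError("truncated push data")
        i += size
        yield op


def has_signature_check(script: bytes) -> bool:
    """True if the script contains at least one signature-checking opcode."""
    return any(op in SIG_OPCODES for op in iter_opcodes(script))


envelope = b"\x00\x63\x01\xac\x68"  # OP_FALSE OP_IF <push 0xac> OP_ENDIF
with_key = bytes([0x20]) + bytes(32) + b"\xac" + envelope  # <32-byte key> OP_CHECKSIG + envelope
print(has_signature_check(envelope), has_signature_check(with_key))
```

The second script passes the check while carrying exactly the same data, which illustrates why "meaningful" is hard to pin down: adding a signature check in front of the data is enough to satisfy a rule of this shape.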

5. Soft Fork to Add OP_RETURN-like Restriction to Witness Data

Change:

Forbid witness data that serves no spending purpose (the witness equivalent of an OP_RETURN output), or set a strict size limit for such data.

Effect:

Prevents using Taproot witness as a replacement for OP_RETURN, which is already limited by relay policy.

Pros:

Consistent with existing OP_RETURN policy.

Cons:

Could require complex script analysis.

May impact future extensibility.

Summary Table

Solution              | Stops Large Blobs? | Breaks Existing Use? | Economic Impact | Implementation Complexity
Per-Push Limit        | Yes (mostly)       | Possible             | None            | Low
Remove Discount       | Yes (by cost)      | No                   | High            | Medium
Per-Input Limit       | Yes                | Possible             | Moderate        | Low
Require Checksigs     | Yes (pattern)      | Possible             | None            | Medium
OP_RETURN Restriction | Yes                | Possible             | None            | Medium

Recommended:

A combination of reducing the witness discount for non-signature/script data and enforcing a tighter per-push size limit would be most effective, minimizing protocol breakage while curbing abuse.

Discussion

reducing witness discount (it should be removed IMO) and enforcing a per push limit on the mempool would fix the problem.

after that, go ahead and uncap the mempool filter on OP_RETURN, because it's same same at this point. without equalizing the cost of witness data being loaded with crap there is no point touching OP_RETURN