Here is a real example of how spam detection can help reduce noise on the Nostr network.

The first image shows traffic as I was enabling it, semi-staggered in two phases.

The next two images are examples of relays (not finger pointing, just examples) whose event traffic dropped 80-90% by volume when I dropped the spam.

Discussion

A few more examples, including the Damus relay, which had a 52% total volume reduction.

Basically, everything flagged as spam was detected with 99% confidence or more.

This stuff all adds up too. I found one specific spam event (duped, unique event_ids) I had stored 360,000 times on my relay 😟
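To illustrate how duplicates like this can be counted: the event id covers the pubkey and timestamp, so the same spam text posted again gets a new id every time. Below is a minimal sketch, assuming events are plain dicts with a `content` field and that normalizing case and whitespace is enough to group copies. `content_fingerprint` and `count_duplicates` are hypothetical helper names for illustration only.

```python
import hashlib
from collections import Counter


def content_fingerprint(event: dict) -> str:
    """Hash only the normalized content, ignoring id, pubkey and created_at."""
    normalized = " ".join(event.get("content", "").lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def count_duplicates(events: list) -> Counter:
    """Count how many stored events share the same content fingerprint."""
    return Counter(content_fingerprint(e) for e in events)


# Made-up sample events: two copies of one message with different ids/pubkeys.
events = [
    {"id": "a1", "pubkey": "k1", "content": "Buy  cheap sats NOW"},
    {"id": "b2", "pubkey": "k2", "content": "buy cheap sats now"},
    {"id": "c3", "pubkey": "k3", "content": "gm nostr"},
]
counts = count_duplicates(events)
most_common_count = counts.most_common(1)[0][1]
print(most_common_count)
```

Exact-match fingerprints only catch verbatim copies; spam that varies a few characters per event would need fuzzier matching.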

Interestingly, there are a fair few relays where the traffic didn’t change at all; they’re basically not being used by spammers (or syncing from a relay that is).

https://i.ibb.co/QbpRCRB/image.page

Fixed image link for previous.

What were the nostr.mom numbers two days ago and today? (I did spam filtering yesterday.)

I don’t have the specific data, however your relay volume in general reduced significantly from two days ago. I’d guess it’s likely related to your efforts. And your relay didn’t change after I applied my spam filtering today.

And here is Nostr global feed before and after spam filtering.

Nice, actually those keywords can also be used for keyword filtering in a Nostr client. Unfortunately, most clients probably haven’t implemented this basic feature yet 😅
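A client-side keyword filter could look roughly like the sketch below. It assumes notes are dicts with a `content` field; the keyword list and the names `MUTED_KEYWORDS`, `build_matcher` and `filter_notes` are made up for illustration and are not taken from any existing client.

```python
import re

# Hypothetical mute list; a real client would let the user edit this.
MUTED_KEYWORDS = ["airdrop", "free sats"]


def build_matcher(keywords):
    """Compile one case-insensitive pattern matching any whole keyword."""
    if not keywords:
        return re.compile(r"(?!)")  # pattern that never matches anything
    escaped = (re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def filter_notes(notes, matcher):
    """Keep only notes whose content matches none of the muted keywords."""
    return [n for n in notes if not matcher.search(n.get("content", ""))]


notes = [
    {"id": "1", "content": "Claim your FREE SATS today"},
    {"id": "2", "content": "Relay stats look much better today"},
]
matcher = build_matcher(MUTED_KEYWORDS)
visible = filter_notes(notes, matcher)
print([n["id"] for n in visible])
```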

You could turn this into a bot that publishes kind 1984 events; relays or clients could then follow it!

It’s proof of concept, but I’ve added meta.spam_score to my event API. I still need to decide how to roll out the ML classifier at scale (it’s a bit slow atm without a GPU). I’ve only classified 40k events thus far to review.
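For anyone consuming a score like this, a minimal sketch of applying a cutoff follows. Only the `meta.spam_score` path comes from the note above; the 0 to 1 scale, the 0.99 threshold and the `is_spam` helper are assumptions for illustration, and the sample events are invented.

```python
SPAM_THRESHOLD = 0.99  # assumed 0..1 scale, mirroring "99% confidence"


def is_spam(event: dict, threshold: float = SPAM_THRESHOLD) -> bool:
    """Treat an event as spam only when a score is present and high enough."""
    score = event.get("meta", {}).get("spam_score")
    return score is not None and score >= threshold


# Invented sample data; the real API response may be shaped differently.
events = [
    {"id": "a", "meta": {"spam_score": 0.998}},
    {"id": "b", "meta": {"spam_score": 0.42}},
    {"id": "c"},  # not classified yet, so it is kept
]
kept = [e["id"] for e in events if not is_spam(e)]
print(kept)
```

Keeping unclassified events by default matters here, since only a fraction of events have been scored so far.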

I’m not sure publishing 1984 events is the best approach. Each spam message would then need its own kind 1984 event, and for just one spam message alone I have 360k+ copies. But alas, it’s not too dissimilar to likes and how one post could have 10MM+ like events in the future.

https://api.nostrgraph.net/beta/events/note1da798nu47zk363rjddg37wlsgy0xw8y0w48yx0vrzfw6652tam3szpzre9.json?pretty=true

Or just stats

https://api.nostrgraph.net/beta/events/note1da798nu47zk363rjddg37wlsgy0xw8y0w48yx0vrzfw6652tam3szpzre9.json?pretty=true&stats_only=true

or you could just tag the pubkey once and let everyone know it is a spam bot... then it is 1 event.

Yeah, the issue is they often use a new pubkey per event.

Turning it into a bot will add transparency too. If a relay is saying I am following this bot then people will know that the relay is not a black box.

I like the concept. The 1984 events could have a known pubkey that could be trusted by relays or others to label events as spam.

One issue is you would lose the spam score component. At what score would an event be created? Is that something clients should instead decide? The score could be inside a 1984 tag perhaps - but scores also could change or improve over time if an event was re-evaluated (perhaps a bug, or new data now correctly identifies something as spam).
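A rough sketch of what a report carrying a score might look like. The kind 1984 layout with `e` and `p` tags and a `spam` report type follows NIP-56, and the id hashing approximates NIP-01 serialization. The `spam_score` tag is purely hypothetical and not defined by any NIP, `build_spam_report` is an illustrative name, the keys and ids below are placeholders, and signing is left out entirely.

```python
import hashlib
import json
import time


def event_id(pubkey, created_at, kind, tags, content):
    """NIP-01 style event id: sha256 over the compact JSON array."""
    payload = json.dumps(
        [0, pubkey, created_at, kind, tags, content],
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_spam_report(reporter_pubkey, spam_event_id, spam_pubkey, score,
                      created_at=None):
    """Build an unsigned kind 1984 report; the spam_score tag is hypothetical."""
    created_at = int(time.time()) if created_at is None else created_at
    tags = [
        ["e", spam_event_id, "spam"],    # reported note plus report type
        ["p", spam_pubkey],              # author of the reported note
        ["spam_score", f"{score:.4f}"],  # assumption, not part of NIP-56
    ]
    report = {
        "pubkey": reporter_pubkey,
        "created_at": created_at,
        "kind": 1984,
        "tags": tags,
        "content": "",
    }
    report["id"] = event_id(reporter_pubkey, created_at, 1984, tags, "")
    return report


# Placeholder 32-byte hex values, not real keys or event ids.
report = build_spam_report("ab" * 32, "cd" * 32, "ef" * 32, 0.9987,
                           created_at=1700000000)
print(json.dumps(report, indent=2))
```

Because the score sits inside the tags, it is covered by the event id, so a re-evaluated score would mean publishing a new report rather than editing the old one.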

Adding the score in the event is a great idea. If the NIPs don’t support it, you can propose a change to the NIPs later.

If the event is recategorized, maybe a new 1984 event could replace the previous one?

This is awesome work!

How do I enable the spam filter on my Nostr client? Is it specific to a relay?

We don’t know yet how best to deal with spam across the protocol. At present it’s easiest for relays to do something as they have the most data. However clients could use a pre-built model to detect spam themselves.

So yeah, today it’s relay specific, and clients can only really switch their relays to ones that have some protection.