Replying to brugeman

Could this be useful?

https://spam.nostr.band/spam_api?method=get_current_spam

I collect events from all relays for the last hour, group events that share common words/n-grams, and find clusters of more than 100 events.
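A rough sketch of what that grouping step could look like. This is only an illustration, not the code behind the API: the function names, the choice of word 3-grams, and the `(event_id, content)` input shape are all assumptions.

```python
import re
from collections import defaultdict


def word_ngrams(text, n=3):
    """Return the set of lowercase word n-grams found in text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def find_clusters(events, n=3, min_size=100):
    """Group events by shared n-gram and keep only the big groups.

    events: iterable of (event_id, content) pairs (hypothetical shape).
    Returns {ngram: set of event ids} for n-grams that appear in
    more than min_size distinct events.
    """
    by_ngram = defaultdict(set)
    for event_id, content in events:
        for gram in word_ngrams(content, n):
            by_ngram[gram].add(event_id)
    return {g: ids for g, ids in by_ngram.items() if len(ids) > min_size}
```

Each surviving n-gram here stands for one candidate cluster. A real version would probably merge overlapping n-grams into a single cluster and restrict the input to the last hour of events.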

This API prints the stop-words for big clusters: if an event contains all of the words for a cluster, it's most likely spam. Relays/clients could proactively match new events against these words, or periodically delete specific events/pubkeys.
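On the relay/client side, the matching could be as small as the sketch below. It assumes the stop-words have already been fetched and parsed into one list of words per cluster; the actual response format of the API is not shown here, so `stop_word_sets` and `is_probable_spam` are hypothetical names.

```python
import re


def is_probable_spam(content, stop_word_sets):
    """Return True if content contains every word of at least one set.

    stop_word_sets: list of word lists, one per spam cluster
    (assumed shape, not the API's documented format).
    """
    words = set(re.findall(r"\w+", content.lower()))
    for stop_words in stop_word_sets:
        required = {w.lower() for w in stop_words}
        # Skip empty sets, which would otherwise match every event.
        if required and required <= words:
            return True
    return False
```

A relay could run this check on each incoming event and reject or flag matches, refreshing the word lists periodically since the clusters change over time.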

Was playing with this today, will be using it in my relay. It's updated close to real-time.

Also

https://spam.nostr.band/spam_api?method=get_current_spam&view=pubkeys

https://spam.nostr.band/spam_api?method=get_current_spam&view=events - BIG!

So much work!

Discussion

You don't like it?

It’s out of spec and creates a dependency on one relay. There must be a way to decentralize this.

AI based decentralized spam detection

#[3]’s approach seems reasonably lightweight and could be added to any relay. classifying using larger models and frequently retraining them would be heavier but surely possible on larger nodes.

also, federated training is a possibility but makes everything a bit more complicated. i guess the most important question is: what features turn out to be predictive. maybe the protocol itself can be leveraged more (network/metadata).

So it's not much work, you just don't want to rely on an api?

The algorithm is simple, every relay could do it, I am just showing what I am experimenting with.

Cool #[3] I think it’s worth exploring