Most of the global feed spam in Damus (and likely most Nostr clients) comes from around 50-100 quasi-template messages. Make those disappear and the noise drops to very little again.
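To illustrate the quasi-template idea, here is a minimal sketch (not the actual detection approach) that normalizes a note's text and compares it against a list of known spam templates. The helper names, the placeholder tokens, the 0.9 threshold and the example template are all assumptions for illustration only.

```python
import re
from difflib import SequenceMatcher

# Hypothetical normalization rules: URLs and numbers are the parts
# that typically vary between copies of the same template.
URL_RE = re.compile(r"https?://\S+")
DIGITS_RE = re.compile(r"\d+")
SPACE_RE = re.compile(r"\s+")


def normalize(text: str) -> str:
    """Lowercase, replace URLs and digit runs with placeholders, collapse whitespace."""
    text = URL_RE.sub("<url>", text.lower())
    text = DIGITS_RE.sub("<n>", text)
    return SPACE_RE.sub(" ", text).strip()


def matches_template(text: str, templates: list[str], threshold: float = 0.9) -> bool:
    """Return True if the normalized text is close to any normalized template.

    The threshold is an assumed value, not a tuned one.
    """
    candidate = normalize(text)
    return any(
        SequenceMatcher(None, candidate, normalize(t)).ratio() >= threshold
        for t in templates
    )


# Made-up template, purely for demonstration.
TEMPLATES = ["Earn 500 USDT daily, join https://example.com/abc now"]
```

With only 50-100 templates, a linear scan like this is cheap enough per event; a larger template set would probably want hashing or an index instead.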
I’ve almost finished a labeled dataset of around 100k events for spam detection. By volume, the current spam is biased toward Asian languages and the non-spam content toward English; however, my relay testing looks promising.
I’ll try to do more testing this weekend with the latest events and see how it performs.
If it works well, perhaps relays can use it to screen published events before accepting them. I’m not sure yet how best to do continual training, but I did use event kind=1984 reports to help identify and tag spam events.
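As a rough sketch of how kind=1984 reports could feed labeling, the snippet below collects the ids of events that were reported as spam. It assumes events are plain dicts in the usual Nostr JSON shape and that reports follow NIP-56, where the report type is the third entry of the `e` tag. The function name and the `min_reporters` parameter are hypothetical, and this is not necessarily how the dataset was actually tagged.

```python
from collections import defaultdict


def spam_ids_from_reports(events: list[dict], min_reporters: int = 1) -> set[str]:
    """Return ids of events reported as spam by at least `min_reporters` distinct pubkeys."""
    reporters: dict[str, set] = defaultdict(set)
    for ev in events:
        if ev.get("kind") != 1984:
            continue  # only report events are relevant here
        for tag in ev.get("tags", []):
            # Expected NIP-56 shape: ["e", <reported event id>, "spam"]
            if len(tag) >= 3 and tag[0] == "e" and tag[2] == "spam":
                reporters[tag[1]].add(ev.get("pubkey"))
    return {eid for eid, who in reporters.items() if len(who) >= min_reporters}
```

Requiring more than one distinct reporter (an assumed safeguard) would make the labels less sensitive to a single account mass-reporting notes it dislikes.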