Nice work! My only nitpick: don't report accuracy in a vacuum. Without the accuracy of a dumb model that always makes the same prediction, which is the baseline we're trying to improve upon, we can't tell how good the number really is.
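
To make the nitpick concrete, here is a rough sketch of the comparison I mean. The labels and predictions below are made up for illustration, and the helper names are my own, not anything from the actual model:

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a 'dumb' model that always predicts the most common label."""
    if not labels:
        raise ValueError("labels must not be empty")
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical held-out set: 90 spam events and 10 legitimate ("ham") events.
y_true = ["spam"] * 90 + ["ham"] * 10
# Hypothetical model output: 2 spam events missed, 5 ham events flagged as spam.
y_pred = ["spam"] * 88 + ["ham"] * 2 + ["ham"] * 5 + ["spam"] * 5

baseline = majority_baseline_accuracy(y_true)  # always answering "spam"
model_acc = accuracy(y_true, y_pred)

print(f"baseline: {baseline:.2f}, model: {model_acc:.2f}")
```

With a split this imbalanced, always answering "spam" already scores 0.90, so the model's 0.93 is a much smaller gain than the raw number suggests.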


Discussion

It's all a moving target, and the accuracy figure is measured against a random 10% sample split off from the data before training.
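
For anyone wondering what that split looks like, here is a minimal sketch using only the standard library. The `holdout_split` helper, the seed, and the fake events are assumptions for illustration, not the actual pipeline:

```python
import random


def holdout_split(events, test_fraction=0.1, seed=42):
    """Shuffle the events and set aside a random fraction before training."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labelled events: (content, is_spam)
events = [(f"note {i}", i % 3 == 0) for i in range(1000)]
train, test = holdout_split(events)

print(len(train), len(test))
```

The model only ever sees `train`; accuracy is then reported on `test`.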

The real-world test is the last couple of weeks, where I'm seeing the accuracy validated against significant volume: I currently pre-filter 6,000 spam events/minute.

Can it be beaten? Yes. Does it prevent flooding or other spam attacks? No. But content-based spam, like email spam, should be manageable to some degree with spam detection models.

I think something like this might be a service that relay operators (mainly paid ones) would be willing to pay for. Pay x number of sats for an API key.

I’m happy to chat to any relay operators who would like a service for this.

Aggregators will have the best data to build and train datasets - and detect spam sooner than relays. It’s certainly a space where one can add value.

To add value for relays, why not extend the NIPs so that pictures and videos can be stored directly on relays, with clients paying for storage and only for writes?

Currently it is very hard to fetch linked content from TW/YT when the client is behind heavy censorship. With the function above, a single relay-to-relay sync makes things easy.

I definitely see a revenue model here, especially for video hosting.