GOSSIP:

I have basic spam filtering via script working on a development branch. I'll need to optimize it and make pages for editing your script. But basically you create a script like this:

    fn filter() {
        if content.contains("AirDrop") {
            0
        } else if content.contains("Ukraine") {
            0
        } else {
            1
        }
    }

0 means deny, 1 means allow, and 2 means "mute the author". The data available to the script (as scoped variables) includes 'id' (hex string), 'pubkey' (hex string), 'kind' (integer), and 'content' (string). I may pass in 'now' (integer) too. The scripting language is pretty extensive. If people come up with good filtering ideas and need more variables passed in, I can enhance it.

I'm having a hard time figuring out how to block spam messages from random new pubkeys with randomized content. What is the algorithm to stop that? Maybe I should make an entropy calculation function: if the entropy is high, the content is probably not a human-readable language.
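One way the entropy idea could look, as a rough sketch in Rust rather than in the filter script itself: character-level Shannon entropy in bits per character. The names `shannon_entropy` and `looks_random` and the threshold parameter are made up for illustration and are not part of gossip.

```rust
use std::collections::HashMap;

/// Shannon entropy of `text` in bits per character, counted over
/// Unicode scalar values. Returns 0.0 for an empty string.
fn shannon_entropy(text: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    let mut total = 0usize;
    for c in text.chars() {
        *counts.entry(c).or_insert(0) += 1;
        total += 1;
    }
    if total == 0 {
        return 0.0;
    }
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical helper: true when the content's entropy exceeds `threshold`.
/// The threshold is an assumption and would need tuning on real notes.
fn looks_random(content: &str, threshold: f64) -> bool {
    shannon_entropy(content) > threshold
}
```

A caveat on this sketch: a string drawn from 16 hex symbols can never exceed 4 bits per character, and a short note cannot exceed log2 of its length, so this measure alone may not separate random content from ordinary text. It would probably work better as one signal among several.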

Discussion

By the account's previous behaviour.

That’s fantastic! Are you still thinking about exposing a Lua API?

Maybe a NIP-05 boolean?

What has the performance penalty looked like so far?

Hey Mike, could we use libretranslate’s language detection somehow? I know it reports a “certainty” value indicating how sure it is about which language is being used.

Might be a bit heavyweight

Interesting!

Checking whether the author is followed by me, or is within n degrees of my follows, could be useful, but I suppose the latter is quite complex to achieve.

Other interesting information to expose:

* A "first seen at" timestamp set using previous events;

* A "last replied at" timestamp to whitelist people you interacted with;

* NIP-05 domain, useful to ban bots that create multiple identities and try to validate themselves in this way;

* NIP-05 status;

If the filter acts on ranges (e.g. 0.0 - 1.0) instead of precise values, it is possible to compose the final score by adding and subtracting the output of many conditions.

I suggest assigning each npub one of three possible states:
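A minimal sketch of that score composition, written in Rust rather than the filter script. Everything here is an assumption for illustration: the `Signal` struct, the neutral starting value of 0.5, the weights, and the `allow_at` cutoff. The output reuses the 0 (deny) and 1 (allow) codes from the original post.

```rust
/// One weighted condition. A positive weight raises the score,
/// a negative weight lowers it. (Hypothetical type.)
struct Signal {
    matched: bool,
    weight: f64,
}

/// Starts from a neutral 0.5, adds the weight of every matched signal,
/// and clamps the result into the 0.0..=1.0 range.
fn compose_score(signals: &[Signal]) -> f64 {
    let adjustment: f64 = signals
        .iter()
        .filter(|s| s.matched)
        .map(|s| s.weight)
        .sum();
    (0.5 + adjustment).clamp(0.0, 1.0)
}

/// Maps a score to a verdict code: 1 (allow) at or above `allow_at`,
/// otherwise 0 (deny).
fn verdict(score: f64, allow_at: f64) -> i64 {
    if score >= allow_at {
        1
    } else {
        0
    }
}
```

For example, a followed author (+0.4) whose note mentions "AirDrop" (-0.3) would land at about 0.6 and still pass a cutoff of 0.5, whereas the same note from a stranger would drop to 0.2.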

- trusted

- malicious

- unknown

Npubs the user follows or writes to first go to the trusted state.

Npubs the user has blocked go to malicious.

Unknown npubs are first-time contacts, and the user, a script, or a paywall should sort them somehow.
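The three states above could be sketched roughly like this in Rust. The `Trust` enum, the `TrustLists` struct, and its field names are hypothetical, and letting a block take precedence over a follow is my own assumption, since the rule is not stated above.

```rust
use std::collections::HashSet;

/// The three proposed states for an npub.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Trust {
    Trusted,
    Malicious,
    Unknown,
}

/// Hypothetical per-user lists, keyed by pubkey string.
struct TrustLists {
    followed: HashSet<String>,
    written_to_first: HashSet<String>,
    blocked: HashSet<String>,
}

impl TrustLists {
    /// Classifies a pubkey. Assumption: a block wins over a follow.
    fn classify(&self, pubkey: &str) -> Trust {
        if self.blocked.contains(pubkey) {
            Trust::Malicious
        } else if self.followed.contains(pubkey) || self.written_to_first.contains(pubkey) {
            Trust::Trusted
        } else {
            Trust::Unknown
        }
    }
}
```

Anything that comes back as `Trust::Unknown` would then be handed to the user, the script, or a paywall for sorting.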