Nostr Web Client

Have you actually implemented this somewhere (production or testing)? I’m curious to know what use cases we might expect to see in the wild in the short term if a bloom filter NIP were to exist.

Vitor Pamplona 2mo ago 💬 3

Internally yes (I have not saved into events yet). All my event, address and pubkey relay hints are saved as 3 massive bloom filters. So, when the app needs to figure out which relays have given IDs, it checks the filter. Which means that I don't need to save any of the events.
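A minimal sketch of the relay-hint idea described above, in Python. Everything here is an assumption for illustration, not the author's actual implementation: the `BloomFilter` class, the `hint_key` encoding (ID and relay URL joined into one key), the sizes, and the use of SHA-256 with double hashing to derive bit positions.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; bit positions come from SHA-256 via double hashing (an assumption)."""

    def __init__(self, size_bits: int, num_hashes: int):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes):
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        # Forced odd so the stride between positions is never zero.
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def hint_key(item_id: str, relay_url: str) -> bytes:
    """Hypothetical encoding: one filter entry per (ID, relay) pair."""
    return f"{item_id}|{relay_url}".encode()


def relays_for(hints: BloomFilter, item_id: str, known_relays: list[str]) -> list[str]:
    """Return the relays that might hold the given ID (false positives are possible)."""
    return [r for r in known_relays if hints.might_contain(hint_key(item_id, r))]


hints = BloomFilter(size_bits=8192, num_hashes=7)
hints.add(hint_key("event-id-1", "wss://relay.example.com"))
candidates = relays_for(
    hints, "event-id-1", ["wss://relay.example.com", "wss://other.example.org"]
)
```

With this layout the app only needs the filter and the list of known relay URLs. A relay that really has the ID is always returned; an occasional extra relay may be returned too, and entries cannot be removed once added.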


Discussion

david 2mo ago

Very interesting. Are any other clients doing this? Would you envision clients sharing filters like this?

Vitor Pamplona 2mo ago 💬 2

I don't think so. Everybody just saves a huge list of relays in their databases.

There are many places clients could share bloom filters. This all started with this idea: https://github.com/nostr-protocol/nips/pull/1497

In this case, I proposed SHA-256 as the hash function so that clients didn't need to code Murmur3, but Murmur3 is so easy that we can just teach people how to do it.

david 2mo ago 💬 2

I’m reading your NIP-76. It only takes 100 bits to handle 10 million keys without any false positives?? Wow. Very cool 🤯

cloud fodder 2mo ago

👀🐳

Vitor Pamplona 2mo ago

I am not sure if that math is still good. This site can give you a better idea: https://hur.st/bloomfilter/

It's all about your probability
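For reference, the usual approximation behind calculators like the one linked above can be sketched in a few lines of Python. The function names are hypothetical, and the example parameters (100 bits from the thread, 3 keys, and 10 hash rounds) are illustrative assumptions rather than values confirmed by the NIP.

```python
import math


def false_positive_rate(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Standard approximation: p = (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def optimal_hashes(m_bits: int, n_items: int) -> int:
    """Number of hash rounds that minimises p for a given m and n: k = (m/n) * ln 2."""
    return max(1, round(m_bits / n_items * math.log(2)))


# Assumed example: a 100-bit filter holding 3 keys, checked with 10 hash rounds.
p_small = false_positive_rate(m_bits=100, n_items=3, k_hashes=10)

# Same filter if someone tried to store 10 million keys in it.
p_packed = false_positive_rate(m_bits=100, n_items=10_000_000, k_hashes=10)
```

The rate depends on three numbers: bits in the filter, keys stored in it, and hash rounds. Changing any one of them moves the probability by orders of magnitude.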

Vitor Pamplona 2mo ago 💬 3

I think that math was wrong. The 10,000,000 figure was not the number of keys inside the filter (which for NIP-76 would be 2-3 keys on average). Rather, relays would have to check that filter against the 10,000,000+ keys that can connect to them. The false positives claim was based on testing 10,000,000 keys against a simple filter like that.

david 2mo ago

Yeah, I suppose 100 bits would be well past all 1’s if we tried to pack in 10^7 pubkeys. If I’m understanding correctly how this works.
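The saturation intuition can be checked with the expected fill fraction of a Bloom filter. This is a small Python sketch; the function name is hypothetical and the 10 hash rounds are an assumed value.

```python
import math


def expected_fill(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Expected fraction of bits set to 1 after inserting n items: 1 - (1 - 1/m)^(k*n)."""
    return 1.0 - math.exp(k_hashes * n_items * math.log1p(-1.0 / m_bits))


# A 100-bit filter with a handful of keys stays mostly empty...
fill_few = expected_fill(m_bits=100, n_items=3, k_hashes=10)

# ...while 10^7 keys would set essentially every bit.
fill_many = expected_fill(m_bits=100, n_items=10**7, k_hashes=10)
```

Once nearly every bit is set, the filter answers "possibly present" for almost any key, so it no longer filters anything.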

david 2mo ago 💬 1

So the question we ask: given a certain set of parameters, if we throw X randomly selected pubkeys at it, what are the odds of 1 or more false positives? And for 10 million it’s still pretty tiny.
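That question can be put into a short Python sketch: take the per-lookup false positive rate p and compute the chance of at least one false positive across X independent probes, 1 - (1 - p)^X. The function names are hypothetical, and the filter parameters (100 bits, 10 hash rounds, 1 or 3 stored keys) are assumptions for illustration only.

```python
import math


def false_positive_rate(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Standard approximation: p = (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def prob_any_false_positive(p: float, probes: int) -> float:
    """Chance that at least one of `probes` unrelated keys passes the filter."""
    if p >= 1.0:
        return 1.0
    # 1 - (1 - p)^X, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(probes * math.log1p(-p))


def probes_for_even_odds(p: float) -> float:
    """Number of probes X at which a false positive becomes a coin flip."""
    if p <= 0.0:
        return math.inf
    if p >= 1.0:
        return 1.0
    return math.log(0.5) / math.log1p(-p)


X = 10_000_000
one_key = prob_any_false_positive(false_positive_rate(100, 1, 10), X)
three_keys = prob_any_false_positive(false_positive_rate(100, 3, 10), X)
tipping_point = probes_for_even_odds(false_positive_rate(100, 1, 10))
```

Under these assumed parameters the answer is very sensitive to how many keys sit in the filter: with one key the odds across 10 million probes stay well under 1%, while with three keys a false positive becomes close to certain. The result is only as good as the assumed bit count, hash rounds, and key count.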

Woodward 2mo ago

Yo, that’s wild! 🤔 So, if we’re tossin’ 10 mil pubkeys into the mix and the odds are still low, what’s the magic number for X that flips the script? 🧐 #CryptoMath #PubkeyMysteries

david 2mo ago

So I think I misunderstood what you meant by “capable of handling up to” a million keys. It means it would successfully defend against being attacked by one million pubkeys trying to gain access.

cloud fodder 2mo ago 💬 2

For a set this size, the WOA scores come to about 100k scores, so it's very doable to just grab the scores, either with an HTTP API or with websocket attestations.

cloud fodder 2mo ago

then, if you want to bloom, you can do it on the device or service, in any way you see fit. I did think about it a bit tho, directly serving blooms, but the reality is, the scores might matter more than just a true/false type thing and, it wasn't that much savings.

Vitor Pamplona 2mo ago 💬 2

They do, and I think individual scores will always be there.

But downloading 100K individual scores takes a while and uses a lot of data and disk space on the phone. Having ways to minimize that usage while providing some rough level of trust enables some light clients to exist.

For instance, a user search or tagging could use NIP-50 to download all the shit on relays and then filter by local bloom filters to know which users are real ones. If bloom filters are not available, then the app needs another round trip to download the individual scores of each key and discard them all when the user closes the search screen.

cloud fodder 2mo ago

the key is really, yes, if you can fit it into the relay, like you mentioned with citrine.. but, if it's too big, well, yeah.

david 2mo ago

Sounds like there will be lots of instances where a WoT Service Provider would want to deliver a bloom filter instead of a big list.

Big lists cause several problems:

1. Unwieldy to transmit by API; even just a slight delay could result in bad UX, depending on the use case

2. Won’t fit in a single event due to size limits

3. Slows down whatever processing the recipient uses the list for.

Any rule of thumb estimates we should keep in the back of our minds as to how big a list of pubkeys or event ids should be before we should think about delivering a bloom filter instead?
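One rough way to frame that estimate, sketched in Python: compare the raw size of the list (pubkeys and event ids are 32 bytes each) with the size of a Bloom filter tuned for a target false positive rate, using m = -n * ln(p) / (ln 2)^2 bits. The function names, the 100K list size, and the 1% target are assumptions for illustration.

```python
import math

KEY_BYTES = 32  # a Nostr pubkey or event id is 32 bytes (64 hex characters)


def raw_list_bytes(n_items: int) -> int:
    """Size of the plain list, binary-encoded, ignoring JSON and hex overhead."""
    return n_items * KEY_BYTES


def bloom_bytes(n_items: int, target_fp_rate: float) -> int:
    """Size of an optimally tuned Bloom filter: m = -n * ln(p) / (ln 2)^2 bits."""
    m_bits = math.ceil(-n_items * math.log(target_fp_rate) / (math.log(2) ** 2))
    return math.ceil(m_bits / 8)


n = 100_000  # assumed list size, matching the ~100K scores mentioned earlier
list_size = raw_list_bytes(n)
filter_size = bloom_bytes(n, target_fp_rate=0.01)
savings_ratio = list_size / filter_size
```

Because the filter needs a fixed number of bits per key for a given false positive rate (about 9.6 bits at 1%), the savings ratio is roughly constant regardless of list size. The practical threshold is therefore more about absolute size, such as whether the raw list still fits in a single event or an acceptable download, and about whether the recipient can live without the individual scores and without being able to enumerate the members.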