Replying to florian

With a similar idea in mind, I have experimented with bloom filters on my blossom server implementation.

https://almond.slidestr.net/_bloom

There is also a server-side check for debugging:

https://almond.slidestr.net/_bloom?test=3deb9daf761c8d638f8ccda3523113b4eddbcf3391c469e8b12da53aa83bf47d6

I haven't really used it yet, but my idea is to have "edge blossom servers" that proxy to "backend blossom servers". The filter would be a great optimization there, because it would tell the proxy where to look for a specific blob. It could also be used in clients to speed up failover to other blossom servers.
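To make the edge proxy idea concrete, here is a rough sketch of how a proxy might pick backends using one Bloom filter per backend. Everything in it is an assumption for illustration: the `BloomFilter` class, the `candidate_backends` helper, the backend names, and the filter parameters are made up and are not the format served at `/_bloom`.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter keyed by blob hash strings (illustrative only)."""

    def __init__(self, num_bits: int = 16384, num_hashes: int = 7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def candidate_backends(blob_hash: str, filters: dict) -> list:
    """Return the backends whose filter says the blob may be present."""
    return [name for name, f in filters.items() if blob_hash in f]


# Hypothetical backends and blob hashes, for illustration only.
blob_a = hashlib.sha256(b"blob a").hexdigest()
blob_b = hashlib.sha256(b"blob b").hexdigest()
filters = {"backend-1": BloomFilter(), "backend-2": BloomFilter()}
filters["backend-1"].add(blob_a)
filters["backend-2"].add(blob_b)

print(candidate_backends(blob_a, filters))
```

A Bloom filter never gives a false negative, so a backend that holds the blob is always in the candidate list. It can give false positives, so the proxy (or a client doing failover) would still have to handle a 404 from a candidate and move on to the next one.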

Have to look into those binary fuse filters now.

How do you handle servers removing blobs? Does it need to recalculate the whole bloom filter?

Discussion

Yes, it currently does. I think a separate filter for added/removed blobs could make sense. I currently recalculate on every request, but that only works because I have a manageable 1200 blobs, all of which are in memory.
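One way the added/removed idea could look is sketched below: keep a Bloom filter snapshot, track changes since the snapshot separately, and rebuild only once enough changes pile up. This is a sketch under assumptions, not what the server does. `BlobIndex`, `rebuild_after`, and the filter parameters are hypothetical, and the deltas are plain exact sets rather than filters to keep it short. Note that a probabilistic "removed" filter would introduce false negatives, since a false positive there would hide a blob that still exists.

```python
import hashlib


def bloom_positions(item: str, num_bits: int, num_hashes: int):
    # Double hashing: derive all bit positions from one SHA-256 digest.
    digest = hashlib.sha256(item.encode()).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % num_bits for i in range(num_hashes)]


def build_bloom(items, num_bits: int = 16384, num_hashes: int = 7) -> int:
    """Build a Bloom filter, using a Python int as the bit set."""
    bits = 0
    for item in items:
        for pos in bloom_positions(item, num_bits, num_hashes):
            bits |= 1 << pos
    return bits


def bloom_has(bits: int, item: str, num_bits: int = 16384, num_hashes: int = 7) -> bool:
    return all((bits >> pos) & 1 for pos in bloom_positions(item, num_bits, num_hashes))


class BlobIndex:
    """Snapshot filter plus exact deltas; rebuilds only past a threshold."""

    def __init__(self, blobs, rebuild_after: int = 100):
        self.blobs = set(blobs)
        self.rebuild_after = rebuild_after
        self.rebuilds = 0
        self._rebuild()

    def _rebuild(self) -> None:
        self.base = build_bloom(self.blobs)
        self.added, self.removed = set(), set()
        self.rebuilds += 1

    def _maybe_rebuild(self) -> None:
        if len(self.added) + len(self.removed) >= self.rebuild_after:
            self._rebuild()

    def add(self, blob_hash: str) -> None:
        self.blobs.add(blob_hash)
        self.removed.discard(blob_hash)
        self.added.add(blob_hash)
        self._maybe_rebuild()

    def remove(self, blob_hash: str) -> None:
        self.blobs.discard(blob_hash)
        self.added.discard(blob_hash)
        self.removed.add(blob_hash)
        self._maybe_rebuild()

    def might_have(self, blob_hash: str) -> bool:
        if blob_hash in self.removed:
            return False
        return blob_hash in self.added or bloom_has(self.base, blob_hash)


# Placeholder hashes, for illustration only.
index = BlobIndex(["a" * 64, "b" * 64], rebuild_after=3)
index.remove("a" * 64)
index.add("c" * 64)
print(index.might_have("a" * 64), index.might_have("c" * 64), index.rebuilds)
```

With this layout a client would download the snapshot rarely and the small delta often, at the cost of having to fetch two things.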

I have also added a binary fuse endpoint now. Its data is noticeably larger, but that is due to the much lower error rate:

https://almond.slidestr.net/_fuse
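For a feel of how size and error rate trade off, the textbook Bloom filter sizing formula can be evaluated for 1200 blobs. The error rates below are assumed example values, not the ones either endpoint actually uses, and the formula covers Bloom filters only, not binary fuse filters.

```python
import math


def bloom_bits(n: int, error_rate: float) -> int:
    """Optimal Bloom filter size in bits: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-n * math.log(error_rate) / math.log(2) ** 2)


def bloom_hashes(n: int, m: int) -> int:
    """Optimal number of hash functions: k = (m / n) * ln 2."""
    return max(1, round(m / n * math.log(2)))


# 1200 blobs as mentioned above; the error rates are assumptions.
for p in (0.01, 0.001, 0.0001):
    m = bloom_bits(1200, p)
    print(f"p={p}: {m} bits (~{math.ceil(m / 8)} bytes), k={bloom_hashes(1200, m)}")
```

Each tenfold reduction in the error rate adds about 4.8 bits per blob, so a much stricter error rate explains a visibly larger payload even at this small blob count.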

Do you have a link to the repo? I'd like to continue looking into this and maybe find a way to write a BUD for it.

Cuckoo filters are similar to Bloom filters but allow you to remove items.
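A minimal sketch of a cuckoo filter with deletion, to show why removal works there: each item is stored as a short fingerprint in one of two buckets, so removing it means deleting that fingerprint. The class name, bucket count, bucket size, and 16-bit fingerprint are assumptions picked for illustration, not a proposal for a wire format.

```python
import hashlib
import random


class CuckooFilter:
    """Illustrative cuckoo filter storing 16-bit fingerprints in small buckets."""

    def __init__(self, num_buckets: int = 1024, bucket_size: int = 4, max_kicks: int = 500):
        # A power of two keeps the XOR-based alternate index inside the table.
        assert num_buckets & (num_buckets - 1) == 0, "num_buckets must be a power of two"
        self.mask = num_buckets - 1
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item: str) -> int:
        return self._hash(b"fp:" + item.encode()) & 0xFFFF

    def _alt_index(self, index: int, fp: int) -> int:
        # Applying this twice returns the original index.
        return (index ^ self._hash(fp.to_bytes(2, "big"))) & self.mask

    def _indexes(self, item: str):
        fp = self._fingerprint(item)
        i1 = self._hash(item.encode()) & self.mask
        return fp, i1, self._alt_index(i1, fp)

    def add(self, item: str) -> bool:
        fp, i1, i2 = self._indexes(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a fingerprint and relocate it.
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = random.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # Too full: the last evicted fingerprint is dropped.

    def __contains__(self, item: str) -> bool:
        fp, i1, i2 = self._indexes(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def remove(self, item: str) -> bool:
        fp, i1, i2 = self._indexes(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


# Placeholder blob hash, for illustration only.
f = CuckooFilter()
blob = hashlib.sha256(b"example blob").hexdigest()
f.add(blob)
print(blob in f)
f.remove(blob)
print(blob in f)
```

One caveat: `remove` should only be called for items that were actually added. Removing something that was never inserted can delete a matching fingerprint that belongs to a different blob, which would cause a false negative for that blob.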