i could so easily make a badger-based store that can do this on an http endpoint with an api for "by blob" and "by pubkey"
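roughly what i mean, as a minimal sketch in go, assuming badger v4 and a made-up key layout (blob:<hash> for the data, pk:<pubkey>:<hash> as a secondary index), not any existing api:

```go
package main

import (
	"io"
	"net/http"

	badger "github.com/dgraph-io/badger/v4"
)

func main() {
	// open a badger store on disk; the path is just illustrative
	db, err := badger.Open(badger.DefaultOptions("/tmp/blobstore"))
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// PUT /blob/<hash>?pubkey=<pk> stores the body under "blob:<hash>"
	// and writes a secondary index key "pk:<pk>:<hash>" for by-pubkey lookup;
	// GET /blob/<hash> returns the raw bytes
	http.HandleFunc("/blob/", func(w http.ResponseWriter, r *http.Request) {
		hash := r.URL.Path[len("/blob/"):]
		key := []byte("blob:" + hash)
		switch r.Method {
		case http.MethodPut:
			body, _ := io.ReadAll(r.Body)
			pk := r.URL.Query().Get("pubkey")
			err := db.Update(func(txn *badger.Txn) error {
				if err := txn.Set(key, body); err != nil {
					return err
				}
				return txn.Set([]byte("pk:"+pk+":"+hash), nil)
			})
			if err != nil {
				http.Error(w, err.Error(), http.StatusInternalServerError)
			}
		case http.MethodGet:
			err := db.View(func(txn *badger.Txn) error {
				item, err := txn.Get(key)
				if err != nil {
					return err
				}
				val, err := item.ValueCopy(nil)
				if err != nil {
					return err
				}
				_, err = w.Write(val)
				return err
			})
			if err != nil {
				http.Error(w, "not found", http.StatusNotFound)
			}
		}
	})

	// GET /pubkey/<pk> lists the blob hashes indexed under that pubkey
	http.HandleFunc("/pubkey/", func(w http.ResponseWriter, r *http.Request) {
		prefix := []byte("pk:" + r.URL.Path[len("/pubkey/"):] + ":")
		_ = db.View(func(txn *badger.Txn) error {
			it := txn.NewIterator(badger.DefaultIteratorOptions)
			defer it.Close()
			for it.Seek(prefix); it.ValidForPrefix(prefix); it.Next() {
				w.Write(it.Item().Key()[len(prefix):])
				w.Write([]byte("\n"))
			}
			return nil
		})
	})

	http.ListenAndServe(":8080", nil)
}
```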

this is the thing

the nostr event structure is practically file metadata... it even gives you arbitrary tags to add extra things to filter it with
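this is just the NIP-01 event shape; the comments are my reading of how each field maps onto file metadata:

```go
// Event is the standard NIP-01 nostr event, read as a file metadata record.
type Event struct {
	ID        string     `json:"id"`         // hash of the record itself
	PubKey    string     `json:"pubkey"`     // owner / author of the "file"
	CreatedAt int64      `json:"created_at"` // timestamp, like mtime
	Kind      int        `json:"kind"`       // the record's "type" / namespace
	Tags      [][]string `json:"tags"`       // arbitrary attributes to filter on
	Content   string     `json:"content"`    // the payload, or a pointer/hash to a blob
	Sig       string     `json:"sig"`        // signature over the record
}
```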

like nostr:npub12262qa4uhw7u8gdwlgmntqtv7aye8vdcmvszkqwgs0zchel6mz7s6cgrkj says, the biggest problem with the filter query protocol is the lack of pagination

i could even think of a way to fix this by adding a new envelope type that connects to a query cache

so, you send a query, the relay scans for matches and assembles a cache item containing the filter plus all of the matching event IDs

this item is stored in a circular buffer, so when the buffer is full the oldest entries are dropped to make room for new ones
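a sketch of that cache, just a fixed-size ring where each slot holds the filter plus the matching serials (all the names here are made up):

```go
package cache

// QueryResult is one cached query: the raw filter JSON and the
// serial numbers of every matching event.
type QueryResult struct {
	Filter  []byte
	Serials []uint64
}

// Ring is a circular buffer of cached query results; when it is
// full, writing a new entry overwrites the oldest one.
type Ring struct {
	slots []QueryResult
	next  int // index the next entry will be written to
}

func NewRing(size int) *Ring {
	return &Ring{slots: make([]QueryResult, size)}
}

// Put stores a new query result, dropping the oldest if needed,
// and returns its slot index (usable as a cursor / cache id).
func (r *Ring) Put(q QueryResult) int {
	idx := r.next
	r.slots[idx] = q
	r.next = (r.next + 1) % len(r.slots)
	return idx
}

// Get returns the cached result at idx, if that slot is still live.
func (r *Ring) Get(idx int) (QueryResult, bool) {
	if idx < 0 || idx >= len(r.slots) || r.slots[idx].Serials == nil {
		return QueryResult{}, false
	}
	return r.slots[idx], true
}
```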

in addition, to be clear, the event IDs are already indexed to a monotonic serial value in the database, so the result sets are not much data: each event in the result is simply an 8 byte (or, as fiatjaf used, 4 byte) serial number and done
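and the serial index side of it, roughly, using badger's sequence to hand out monotonic ids and encoding them as 8-byte big-endian keys (the ser:/id: prefixes are made up; seq would come from db.GetSequence):

```go
package index

import (
	"encoding/binary"

	badger "github.com/dgraph-io/badger/v4"
)

// AddEvent assigns the next monotonic serial to an event ID and writes
// both directions of the mapping: "ser:<serial>" -> id and "id:<id>" -> serial.
func AddEvent(db *badger.DB, seq *badger.Sequence, eventID []byte) (uint64, error) {
	serial, err := seq.Next()
	if err != nil {
		return 0, err
	}
	// 8 bytes per event in a result set, as discussed above
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], serial)
	err = db.Update(func(txn *badger.Txn) error {
		if err := txn.Set(append([]byte("ser:"), buf[:]...), eventID); err != nil {
			return err
		}
		return txn.Set(append([]byte("id:"), eventID...), buf[:])
	})
	return serial, err
}
```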

i used 8 bytes because i think 4 billion records is not very much when the average event size is around 700 bytes... 4 billion records at 700 bytes each is only about 2.8TB, so a 4 byte serial could realistically run out

the biggest problem with all of this is encoding

JSON makes binary data somewhat expensive to store, because you have to use base64, and even though you can use unicode i don't know of a scheme that leverages it to push the ratio from 6 of 8 bits per byte of output up to something very close to 8 of 8
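just to put a number on the base64 overhead:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

func main() {
	// base64 carries 6 bits of payload per output byte, so a 700 byte
	// blob grows by about a third once it's embedded in JSON.
	n := 700
	fmt.Println(base64.StdEncoding.EncodedLen(n)) // 936
}
```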

TLVs are a very nice format for this kind of thing: you have a type code, then a blob length, and then the data. the type code can be human readable and so can the length value, you probably just need some kind of separator between them... think like a tag, but instead of kind/pubkey/d-tag it's a 4 character magic and a decimal size value: JPEG:1000020: and then, right after those 1000020 bytes of data, a new one like HTML:10002:....
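a write/read pair for that framing (4 character magic, ':', decimal length, ':', then the raw bytes), just as a sketch of the idea:

```go
package tlv

import (
	"bufio"
	"fmt"
	"io"
	"strconv"
)

// WriteTLV emits a record like "JPEG:1000020:" followed by the raw data.
func WriteTLV(w io.Writer, magic string, data []byte) error {
	if len(magic) != 4 {
		return fmt.Errorf("magic must be 4 characters, got %q", magic)
	}
	if _, err := fmt.Fprintf(w, "%s:%d:", magic, len(data)); err != nil {
		return err
	}
	_, err := w.Write(data)
	return err
}

// ReadTLV reads one record back: magic, then decimal length, then the blob.
func ReadTLV(r *bufio.Reader) (magic string, data []byte, err error) {
	header, err := r.ReadString(':') // e.g. "JPEG:"
	if err != nil {
		return "", nil, err
	}
	magic = header[:len(header)-1]
	lenStr, err := r.ReadString(':') // e.g. "1000020:"
	if err != nil {
		return "", nil, err
	}
	n, err := strconv.Atoi(lenStr[:len(lenStr)-1])
	if err != nil {
		return "", nil, err
	}
	data = make([]byte, n)
	_, err = io.ReadFull(r, data)
	return magic, data, err
}
```

the nice part of declaring the length up front is that a reader can skip records it doesn't care about by seeking past the blob without parsing it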
