Would it work on Nostr events stored on relays?
Discussion
nostr events are addressed the same way, by the event id, which is a hash
it's even something i thought about doing - creating a layer 2 that uses blossom with the event ids as the blob ids... the only tricky part is that blossom expects the hash to cover the whole blob, but the nostr event id is computed over the event minus the signature, so the signature has to be attached alongside the content rather than being part of what the hash refers to (this is how i implement the event store in realy)
probably a simple extension to blossom could be suggested where you provide an npub and signature as an adjunct to the file identified by the hash, but this is precisely the point you are digging at - what about we just use the nostr event format... i mean, you can store arbitrary sizes, with the small limitation that base64 encoding gives you 6 bits per byte; this is a limitation of json
how it's actually stored can be entirely different - it can be binary encoded, or you can mess with it and use json for the metadata, then attach giant blobs of binary to the end and store that against a key in a key-value store
if i were to say how i would prefer to do it - you'd have pubkey/sig/blob
you could search the events by pubkey and blob hash and verify their authenticity with the sig
i could so easily make a badger-based store that does this behind an http endpoint, with an api for "by blob" and "by pubkey"
this is the thing
the nostr event structure is practically file metadata... it even gives you arbitrary tags to add extra things to filter with
like nostr:npub12262qa4uhw7u8gdwlgmntqtv7aye8vdcmvszkqwgs0zchel6mz7s6cgrkj, the biggest problem with the filter query protocol is the lack of pagination
i could even think of a way to fix this by adding a new envelope type that connects to a query cache
so, you send a query, the relay scans for matches and assembles a cache item, which contains the filter plus all of the matching event IDs
this item is stored in a circular buffer so when the buffer is full, the oldest ones are dropped to make room for the new ones
in addition, to be clear, the event IDs are already indexed to a monotonic serial value in the database, so it's not a large amount of data - each event in the result is simply an 8 byte (or, as fiatjaf used, 4 byte) serial number and done
i used 8 bytes because i think 4 billion records is not very much when the average event size is around 700 bytes - that's only about 2.8TB
the biggest problem with all of this is encoding
JSON makes binary data somewhat expensive to store, because you have to use base64, and even though you can use unicode, i don't know of a scheme that leverages unicode to improve the ratio from 6 of 8 bits per byte of data to something very close to 8 of 8
TLVs are a very nice format for this kind of thing: you have a type code, then a blob length, and then the data. the type code can be human readable and so can the length value; you probably just need some kind of separator between them... think like tags, but instead of kind/pubkey/d-tag it's 4-character magics and decimal size values: JPEG:1000020:
Do you mean git blob hash?
That was nostr:nprofile1qqs8qy3p9qnnhhq847d7wujl5hztcr7pg6rxhmpc63pkphztcmxp3wgpz9mhxue69uhkummnw3ezuamfdejj7qgmwaehxw309a6xsetxdaex2um59ehx7um5wgcjucm0d5hsz9nhwden5te0dehhxarjv4kxjar9wvhx7un89uqaujaz original NIP-62 idea, but it got slapped down because everyone wanted the commit content in events. And now they're creating Blossom blobs that are copies of the git blobs, or something.
yeah, i think putting the commit hash in the events, referring to a blob hash that is stored in blossom, is the way to go
the thing is that i don't think Git uses sha256, so you'd have to have a variant of blossom that uses whatever hash it is... md5? idk 😕
git seriously needs to be upgraded as a protocol, to be honest... it was SHA1, i remember now...
SHA256 is already supported
ok so that means that you can store the nodes in events and refer to blobs to fetch them
blossom imo as a protocol is garbage, as it tries to consolidate management (upload/delete/list) with retrieval of blobs
it is a big pain in the ass for scaling, look at any service and you will see cdn domains are separate from upload
blossom also makes no attempt to allow media optimization, and I believe it is an acceptable tradeoff to sacrifice integrity for reduced data usage if you can turn it off as needed
blobs should be identified by nostr event IDs, meaning you get metadata for free, and if a user wants their blob gone, they can issue a delete event and send it to all hosts
rehosting content becomes an explicit action
Yeah because it doesn't make sense to rebuild git servers from scratch out of Nostr events when we already have git servers.
I like nostr:nprofile1qqsyeqqz27jc32pgf8gynqtu90d2mxztykj94k0kmttxu37nk3lrktcpz9mhxue69uhkummnw3ezuamfdejj7qg3waehxw309ahx7um5wghxcctwvshsz9nhwden5te0d4kx26m49eex2ctv0yhxcmmv9ume3twc idea of using ephemeral events, but I'd maybe do it a bit differently.
Yeah ephemeral events for synchronizing state is a good use of Nostr's event-driven nature.