an example of the retardation of #nostr kinds, instead of using industry standard #mimetypes

i'm writing a full text index, and i've made a decent filter that ignores symbols and URLs and suchlike

so, actually, which event kinds even have content fields that you'd want to search? textnote kind 1 and articles? there will be more, thanks to the awesome folk at nostr:npub1s3ht77dq4zqnya8vjun5jp3p44pr794ru36d0ltxu65chljw8xjqd975wz, so i will have to update the kind whitelist in the future to cover their cases. but just imagine... what if i could just filter on a mimetype prefix of "text"?

omg! what a revolution!

to not have to constantly scan through hundreds of bullshits and their format definitions to figure out if they might contain relevant content to your search engine's hunger for actual fucking text

nah, what a stupid idea. only been in use for over 20 years, so it's surely not stable at this point

*cough*
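
a rough sketch of the difference between the two filters, in Python. assumptions, none of which are established here: events are plain dicts with `kind`, `content` and `tags`, the mimetype sits in a hypothetical `M` tag (nothing defines such a tag for these kinds today), and 30023 stands in for "articles". all function names are made up for illustration.

```python
# Kinds known to carry searchable text. This is the whitelist that has to be
# edited by hand every time a new text-bearing kind appears.
TEXT_KINDS = {1, 30023}  # 1 = text note, 30023 = long-form article


def is_indexable_by_kind(event):
    """Current approach: consult a hand-maintained kind whitelist."""
    return event["kind"] in TEXT_KINDS


def mime_type_of(event, tag_key="M"):
    """Return the value of the first mimetype tag, or None if absent.

    The tag key is an assumption; pass a different one if needed.
    """
    for tag in event.get("tags", []):
        if len(tag) >= 2 and tag[0] == tag_key:
            return tag[1].strip().lower()
    return None


def is_indexable_by_mime(event):
    """Proposed approach: index anything whose mimetype starts with text/."""
    mime = mime_type_of(event)
    return mime is not None and mime.startswith("text/")
```

with this, an event of some brand new kind tagged `["M", "text/markdown"]` gets picked up by `is_indexable_by_mime` without anyone touching `TEXT_KINDS`.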

Hmm. This is a decent point. I am not thinking of nostr specifically, but I have kicked around the idea of having all data identified by a context hash; H("nostr/note/kind 1") would be an example. Then applications could create their own data types. But it isn't super useful for searching without a mimetype.

Food for thought.
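
A minimal sketch of the context hash idea, in Python. H is not specified above, so SHA-256 is assumed here, and `context_hash` and `make_record` are hypothetical names. The point it illustrates is that the hash is opaque, so a search engine still needs a mimetype next to it to know whether the payload is text.

```python
import hashlib


def context_hash(context: str) -> str:
    """H(context): SHA-256 is an assumption, any stable hash would do."""
    return hashlib.sha256(context.encode("utf-8")).hexdigest()


def make_record(context: str, mime_type: str, payload: str) -> dict:
    """Identify the data type by its context hash and carry a mimetype too.

    The hash alone says nothing about the payload format, which is why the
    mimetype is stored beside it.
    """
    return {
        "context": context_hash(context),
        "mime": mime_type,
        "payload": payload,
    }


record = make_record("nostr/note/kind 1", "text/plain", "hello world")
```

An application could mint its own type by picking a new context string, and an indexer could still decide what to index from `record["mime"]` alone.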

Discussion

yeah, my recommendation is an M tag that uses standard mimetypes

for now, textnote and article are pretty much plaintext and markdown, but they should always have had a mimetype tag attached to them

having to write code that analyses content fields to divine their type is the epitome of retarded protocol architecture

Oh, this, but a list!

I also don't get the 1337ness of single character keys. Somewhat as retarded as kinds but yeah, tolerable.

For indexing.

it's just to limit how many indexes the relay's event store has to make, to a max of 52 different tag keys (a-z and A-Z). these indexes are required to enable search: if you don't index the tags, the only way to search is to exhaustively iterate through the records, which would be insanely slow. but if any tag key at all could be required to be indexed, the database is again going to be very slow
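
a toy sketch of that limit, in Python. this is not the eventstore code, just an in-memory illustration under assumed shapes: events are dicts with an `id` and a `tags` list, and `build_tag_index` is a made-up name.

```python
import string
from collections import defaultdict

# Only single-letter keys are indexed: a-z plus A-Z gives 52 possible keys.
INDEXABLE_KEYS = set(string.ascii_letters)


def build_tag_index(events):
    """Map (tag key, tag value) to the set of event ids carrying that tag.

    Tags with multi-character keys are stored with the event but never
    indexed, so the number of distinct indexed keys stays bounded at 52.
    """
    index = defaultdict(set)
    for event in events:
        for tag in event.get("tags", []):
            if len(tag) >= 2 and tag[0] in INDEXABLE_KEYS:
                index[(tag[0], tag[1])].add(event["id"])
    return index


events = [
    {"id": "a", "tags": [["t", "nostr"], ["client", "foo"]]},
    {"id": "b", "tags": [["t", "nostr"], ["M", "text/plain"]]},
]
index = build_tag_index(events)
```

a lookup on `("t", "nostr")` is a single dictionary access, while finding events by the unindexed `client` tag would mean walking every record.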

fiatjaf added this because he knows about database indexes, his code in the eventstore repo is quite extensive and actually even has a little bit of documentation

While clever, we are still optimizing database logic at the protocol level.

Anyway, we can keep a small int kind for CORE event types. My argument is against the rest of the bloat which will never be enough.

Nothing stops a relay backend from preprocessing mimetypes and translating each one, for those that have indexing interest, into a unique small int. Negligible preprocessing overhead.
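
A minimal sketch of that translation step, in Python. `MimeInterner` is a hypothetical name, and treating only `text/` types as having indexing interest is an assumption for the example. A real backend would also need to persist the table so the ids stay stable across restarts.

```python
class MimeInterner:
    """Assign a small int to each mimetype the backend cares to index."""

    def __init__(self, indexed_prefixes=("text/",)):
        # Which mimetype prefixes are of indexing interest (assumed here).
        self._indexed_prefixes = tuple(indexed_prefixes)
        self._ids = {}

    def intern(self, mime_type):
        """Return the small int for mime_type, or None if it is not indexed."""
        mime = mime_type.strip().lower()
        if not mime.startswith(self._indexed_prefixes):
            return None
        if mime not in self._ids:
            # Ids are handed out in first-seen order, starting at 1.
            self._ids[mime] = len(self._ids) + 1
        return self._ids[mime]


interner = MimeInterner()
plain_id = interner.intern("text/plain")
markdown_id = interner.intern("text/markdown")
```

The index key then stays as compact as a kind number, while the event itself keeps the readable mimetype.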

I'm using `m` for MIME types and `M` for usage/category