Right, so an AI that can scrape relays and categorize incongruent data points (usernames across various clients, content within screenshots or GIFs), then return those results along with the relays needed to access the content if the querying user is not already connected.
Discussion
Seems right and must be extremely fast.
I think the second half is also key... it is not only about retrieving and displaying content. Discoverability also requires knowing, or being told, which relays are required to access said data. Without already having the relay, a user is entirely blind to the data. Sort of like how you can't ask good questions if you don't already have a baseline knowledge.
To make it fast, I foresee some sort of client that houses the AI search bot. It's connected to all the relays, and the AI ingests the data across all of nostr. It then categorizes this data and caches it. Sort of like keeping shorthand notes. Then when the AI search service is queried by a client, it can run through its cached notes and recall how to retrieve the data for that query (noteID or whatever). It would only ever need to retrieve data from the relays once to learn it and store the relevant details, since it only needs to point you to the note (instead of constantly polling to retrieve the actual note, like current clients do).
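Here is a rough sketch of the "shorthand notes" idea in Python, just to make it concrete. It is not how any existing nostr search service works. The names `Pointer`, `PointerIndex`, `ingest`, `search` and `relays_to_add` are all made up for illustration, the relay connection itself is left out, and plain keyword matching stands in for whatever the AI categorization would actually do. The point it shows: the index keeps only the note ID and the relays the note was seen on, and a query returns pointers plus the relays the user would still need to add.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Pointer:
    """Shorthand note: where a note lives, not the note itself (hypothetical type)."""
    note_id: str
    relays: set = field(default_factory=set)


class PointerIndex:
    """In-memory keyword index from terms to note pointers (illustrative only)."""

    def __init__(self):
        self._pointers = {}             # note_id -> Pointer
        self._terms = defaultdict(set)  # term -> set of note_ids

    def ingest(self, note_id, content, relay_url):
        """Record one note seen on one relay; the content is tokenized, then dropped."""
        pointer = self._pointers.setdefault(note_id, Pointer(note_id))
        pointer.relays.add(relay_url)
        for term in set(re.findall(r"\w+", content.lower())):
            self._terms[term].add(note_id)

    def search(self, query, connected_relays=()):
        """Return pointers for notes matching every query term."""
        terms = re.findall(r"\w+", query.lower())
        if not terms:
            return []
        ids = set.intersection(*(self._terms.get(t, set()) for t in terms))
        connected = set(connected_relays)
        results = []
        for note_id in sorted(ids):
            relays = self._pointers[note_id].relays
            results.append({
                "note_id": note_id,
                "relays": sorted(relays),
                # Empty if the user already shares a relay with the note.
                "relays_to_add": [] if relays & connected else sorted(relays),
            })
        return results
```

Calling `index.ingest("note1", "search on nostr", "wss://relay.a.example")` and then `index.search("nostr search", connected_relays=["wss://relay.c.example"])` would return a pointer to `note1` with `wss://relay.a.example` listed under `relays_to_add`. The IDs and URLs here are placeholders, not real events or relays.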
Wouldn’t the search algo perform much better if it keeps the notes it has indexed / embedded in local storage?
Sounds centralized.
Yeah, but there's no reason there can't be multiple search services. A single service might be centralized, but if it is FOSS, people can spin up their own, or several. It's no more centralized than Damus or Amethyst is centralized.
It would need too much storage space. It'd have to keep a collection of all notes ever written, and that will grow significantly day by day.
I think the problem of discovery was solved, at least for the most part, around 2003. A prompt could be tweaked to yield the best result. After that, commercial interference started to meddle with the results. Ever since, search results have become worse and worse. Nowadays an algorithm cross-references your prompt so that you get the most commercial result possible. AI could fix this, but for Nostr I think all we need is that brute-force engine that Google developed 20 years ago.