Replying to fiatjaf

Can anyone teach me how to do this? https://emschwartz.me/binary-vector-embeddings-are-so-cool/

There is so much jargon about this stuff I don't even know where to start.

Basically I want to do what https://scour.ing/ is doing, but with Nostr notes/articles only, and expose all of it through custom feeds on a relay like wss://algo.utxo.one/ -- or if someone else knows how to do it, please do it, or talk to me, or both.

Also, I don't want to pay a dime to any third-party service, and I don't want to have to use a supercomputer with GPUs.

Thank you very much.

It would be interesting if the performance really is that good, which would imply it's already being used by some services. I haven't looked around, but if there are any open-source models that can produce these embeddings, you could likely load them on a phone or laptop. Meaning, you'd compute the embedding locally first and then send the data off for the processing and recommendation.
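As a rough sketch of what that local step could look like, here's a minimal Python example. It assumes the sentence-transformers library and the all-MiniLM-L6-v2 model (both assumptions on my part; any small open-source embedding model would do), runs on CPU, and binarizes the float embeddings by keeping only the sign of each dimension, which is the quantization trick the linked article is about:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Small open-source model, ~80 MB, runs fine on a laptop CPU.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

notes = [
    "a short nostr note about bitcoin",
    "a long-form article about relays",
]

# Float embeddings: shape (len(notes), 384) for this model.
float_vecs = model.encode(notes, normalize_embeddings=True)

# Binary quantization: keep only the sign of each dimension.
# A 384-dim float32 vector (1536 bytes) becomes 384 bits (48 bytes).
binary_vecs = np.packbits((float_vecs > 0).astype(np.uint8), axis=1)

print(binary_vecs.shape)  # (2, 48)
```

That's a 32x storage reduction per vector, which is what makes shipping the whole index to a phone plausible in the first place.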

Search on a local machine is a different challenge, because whatever does the searching needs access to all the data (text and vectors) in a vector DB. Vector search is a turbocharged k-nearest-neighbors algorithm, returning the top k closest entries by semantic distance; the vectors for 'dog' and 'puppy' sit closer together than the vectors for 'dog' and 'chicken'. The concept scales up to paragraphs and pages of text: you can imagine a fairy tale sitting far from a cluster of physics papers because they're unrelated.
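With binary vectors, that semantic distance reduces to Hamming distance (the number of differing bits), so brute-force k-NN stays cheap even without an index. A small sketch reusing the `binary_vecs` matrix from the snippet above (the function name is mine, purely illustrative):

```python
import numpy as np

def hamming_knn(query: np.ndarray, corpus: np.ndarray, k: int = 5) -> np.ndarray:
    """Indices of the k corpus rows closest to `query` by Hamming distance."""
    # XOR marks the differing bits; unpack them and count per row.
    dists = np.unpackbits(query ^ corpus, axis=1).sum(axis=1)
    return np.argsort(dists)[:k]

# e.g. the 5 notes closest to the first one:
# neighbors = hamming_knn(binary_vecs[:1], binary_vecs, k=5)
```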

Vector search indexes essentially use a graph (a network of nearby vectors) to identify the nodes closest to a query for a recommendation, so I wonder if you could send a partial network of related ideas to search through - that way clients wouldn't need to rely entirely on a main service for all the data, just for entries beyond some distance.
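Purely as an illustration of that partial-network idea (everything here is hypothetical, not an existing protocol): a service could ship only the entries within some Hamming radius of a seed vector, and the client would run the k-NN search above over just that subset locally.

```python
import numpy as np

def partial_corpus(seed: np.ndarray, corpus: np.ndarray, radius: int) -> np.ndarray:
    """Rows of `corpus` within `radius` differing bits of `seed`."""
    dists = np.unpackbits(seed ^ corpus, axis=1).sum(axis=1)
    return corpus[dists <= radius]
```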

Been brainstorming these ideas with NKBIP-02

https://wikifreedia.xyz/nkbip-02/liminal@nostrplebs.com
