Replying to fiatjaf

Can anyone teach me how to do this? https://emschwartz.me/binary-vector-embeddings-are-so-cool/

There is so much jargon about this stuff I don't even know where to start.

Basically I want to do what https://scour.ing/ is doing, but with Nostr notes/articles only, and expose all of it through custom feeds on a relay like wss://algo.utxo.one/ -- or if someone else knows how to do it, please do it, or talk to me, or both.

Also, I don't want to pay a dime to any third-party service, and I don't want to have to use a supercomputer with GPUs.

Thank you very much.

You can start by playing around with MixedBread’s trained model, which they say supports both binary and matryoshka embeddings.

https://www.mixedbread.ai/blog/mxbai-embed-large-v1
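
If it helps, here is a minimal sketch of what that looks like with the sentence-transformers library, which is what their examples use. The model name and the quantize_embeddings helper are taken from the MixedBread / Hugging Face docs; the sample notes are made up, and this runs fine on CPU:

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

# Downloads the model weights on first run; no GPU required.
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

notes = [
    "A note about bitcoin fees going up again.",
    "Long-form article on relay-based feed algorithms.",
]

# Float32 embeddings (1024 dimensions for this model).
embeddings = model.encode(notes, normalize_embeddings=True)

# Binary quantization: each dimension becomes one bit, packed into
# uint8 bytes -> 1024 bits = 128 bytes per note.
binary_embeddings = quantize_embeddings(embeddings, precision="ubinary")

print(embeddings.shape, embeddings.dtype)                 # (2, 1024) float32
print(binary_embeddings.shape, binary_embeddings.dtype)   # (2, 128) uint8
```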

Discussion

Thank you. I had seen this MixedBread stuff mentioned somewhere but I thought it was a paid API.

I think the model is open; look at the code examples here:

https://huggingface.co/blog/embedding-quantization#binary-quantization-in-sentence-transformers

In any case, binary quantization can be applied to other embedding models too.
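
To make that concrete, here is a hedged sketch of the search side: take whatever float embeddings you have, quantize them to bits, and rank by Hamming distance with plain numpy. The function names are just for illustration, and random vectors stand in for real model output:

```python
import numpy as np

def to_bits(embeddings: np.ndarray) -> np.ndarray:
    """Binary-quantize float embeddings: 1 if a value is > 0, else 0,
    packed 8 dimensions per uint8 byte."""
    return np.packbits(embeddings > 0, axis=-1)

def hamming_search(query_bits: np.ndarray, corpus_bits: np.ndarray, top_k: int = 10):
    """Rank corpus vectors by Hamming distance (number of differing bits)."""
    # XOR the query against every document, then count the set bits.
    distances = np.unpackbits(query_bits ^ corpus_bits, axis=-1).sum(axis=-1)
    order = np.argsort(distances)[:top_k]
    return order, distances[order]

# Random "embeddings" standing in for real model output.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 1024)).astype(np.float32)
query = rng.normal(size=(1, 1024)).astype(np.float32)

corpus_bits = to_bits(corpus)   # (1000, 128) uint8
query_bits = to_bits(query)     # (1, 128) uint8

idx, dist = hamming_search(query_bits, corpus_bits, top_k=5)
print(idx, dist)
```

The linked Hugging Face post also describes re-scoring the binary shortlist with the original float embeddings to recover most of the retrieval accuracy, which is where most of the practical quality comes from.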