You can start by playing around with MixedBread’s trained model, which they say supports both binary and matryoshka embeddings.
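For intuition: matryoshka embeddings let you truncate a vector to just its first N dimensions, and binary quantization then keeps only the sign bit of each remaining dimension. Here's a minimal sketch of the two steps combined, using random vectors as stand-ins for real model outputs (no model download needed, just numpy; the 1024-dim size matches what mxbai-embed-large-v1 produces, but these numbers are placeholders):

```python
import numpy as np

# Stand-in for model outputs: 1024-dim float32 embeddings.
# A real pipeline would get these from the embedding model instead.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((100, 1024)).astype(np.float32)

# Matryoshka step: keep only the first 512 dimensions.
truncated = embeddings[:, :512]

# Binary quantization step: keep only the sign of each dimension,
# packed 8 dimensions per byte.
binary = np.packbits(truncated > 0, axis=1)

print(embeddings[0].nbytes)  # 4096 bytes per vector as float32
print(binary[0].nbytes)      # 64 bytes per vector: a 64x reduction
```

That 64x storage reduction (half from truncation, 32x from 1 bit per dimension instead of 32) is basically the whole pitch of the linked blog post.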
Can anyone teach me how to do this? https://emschwartz.me/binary-vector-embeddings-are-so-cool/
There is so much jargon about this stuff I don't even know where to start.
Basically I want to do what https://scour.ing/ is doing, but with Nostr notes/articles only, and expose all of it through custom feeds on a relay like wss://algo.utxo.one/ -- or if someone else knows how to do it, please do it, or talk to me, or both.
Also I don't want to pay a dime to any third-party service, and I don't want to have to use a supercomputer with GPUs.
Thank you very much.
Discussion
Thank you. I had seen this MixedBread stuff mentioned somewhere but I thought it was a paid API.
I think the model is open; look at the code examples here:
https://huggingface.co/blog/embedding-quantization#binary-quantization-in-sentence-transformers
In any case, binary quantization can be applied to other embedding models too.
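To illustrate that point: binary quantization just thresholds each dimension at zero, so you can apply it to float embeddings from any model and then search by Hamming distance (popcount of XOR) instead of cosine similarity. A numpy-only sketch with random vectors standing in for real embeddings (a real setup would use the quantization helpers from the linked sentence-transformers post, but the idea is the same):

```python
import numpy as np

rng = np.random.default_rng(42)
# Pretend these came from any embedding model at all.
docs = rng.standard_normal((1000, 384)).astype(np.float32)
# A query that is a slightly noisy copy of doc 7.
query = docs[7] + 0.1 * rng.standard_normal(384).astype(np.float32)

# Binarize: 1 where the dimension is positive, packed into bytes.
docs_bin = np.packbits(docs > 0, axis=1)   # shape (1000, 48)
query_bin = np.packbits(query > 0)         # shape (48,)

# Hamming distance = number of differing bits = popcount of XOR.
xor = np.bitwise_xor(docs_bin, query_bin)
hamming = np.unpackbits(xor, axis=1).sum(axis=1)

print(hamming.argmin())  # prints 7: the doc the query was derived from
```

Since it's all bitwise ops on packed bytes, this kind of search is fast on a plain CPU, which fits the no-GPU constraint above.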