That would be awesome. Price per user would be quite high, wouldn't it?
Likely not. Have you used GPT4All? You can download a ~5 GB model and run it locally on a CPU without internet. Shit is wild and accelerating exponentially.
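A minimal sketch of the local setup, assuming the `gpt4all` Python package (the model filename is just an example, not a recommendation):

```python
# pip install gpt4all
from gpt4all import GPT4All

# Downloads the model file (a few GB) on first run, then runs fully
# offline on CPU. Model name is an example; GPT4All.list_models()
# shows what's actually available.
model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")

with model.chat_session():
    print(model.generate("Explain embeddings in one sentence.", max_tokens=100))
```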
Hmm, thinking about this: if we just save the embedding vector for each post (the potentially tricky part) and each profile, then embed each incoming query and compute its distance to the post+profile vectors, maybe all of that costs about as much compute as generating a single token.
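A rough sketch of that pipeline, assuming a small open embedding model via `sentence-transformers` (model choice and the corpus are placeholders):

```python
# pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

# Small CPU-friendly embedding model (~22M params), cheap per token.
model = SentenceTransformer("all-MiniLM-L6-v2")

posts = ["first post text ...", "second post text ..."]    # placeholder corpus
post_vecs = model.encode(posts, normalize_embeddings=True)  # precompute once, store

def search(query: str, k: int = 5):
    # One embedding per query, then a single matrix-vector product.
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = post_vecs @ q          # cosine similarity (vectors are unit-normalized)
    top = np.argsort(-scores)[:k]
    return [(posts[i], float(scores[i])) for i in top]

print(search("running LLMs locally on CPU"))
```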
On the one hand, we need to remember that running a few prompts for nearly free isn't the same as running thousands to millions per day for nearly free.
On the other, maybe these embeddings are quite cheap compared to token generation.
I wonder exactly how much they cost in compute.
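A hedged back-of-envelope, using the rule of thumb that a transformer forward pass costs roughly 2 FLOPs per parameter per token (model sizes below are illustrative assumptions, not measurements):

```python
# Rough FLOPs-per-token comparison: small embedding model vs. 7B generator.
# Rule of thumb: one forward pass ~ 2 * params FLOPs per token.
EMBED_PARAMS = 22e6   # e.g. a MiniLM-class embedding model (assumption)
GEN_PARAMS = 7e9      # e.g. a 7B local chat model (assumption)

flops_embed_token = 2 * EMBED_PARAMS
flops_gen_token = 2 * GEN_PARAMS

print(f"embedding is ~{flops_gen_token / flops_embed_token:.0f}x cheaper per token")
# -> roughly 300x cheaper per token, and each query only needs embedding once,
#    while generation pays that cost for every output token.
```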