Very close to having a local llm like mistral embedded in damus notedeck. It will be able to summarize your nostr feed. All local AI. This is so cool.
How many params?
I got 7B to 11B models working on desktop, taking about 5-7 GB of memory to sample. The 3B-param model was taking around 2 GB.
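Those memory figures line up with what you'd expect from 4-bit quantized weights. A rough back-of-the-envelope sketch (the overhead term is an assumption for KV cache and runtime buffers, not a measurement from this thread):

```python
def model_memory_gb(params_billion: float, bits_per_weight: float,
                    overhead_gb: float = 1.0) -> float:
    """Estimate resident memory for a quantized LLM:
    weights take params * (bits / 8) bytes, plus an assumed
    fixed overhead for KV cache and runtime buffers."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes / 1e9 + overhead_gb

# 7B at 4-bit: 3.5 GB of weights + ~1 GB overhead ≈ 4.5 GB
print(round(model_memory_gb(7, 4), 1))
# 3B at 4-bit: 1.5 GB of weights + ~1 GB overhead ≈ 2.5 GB
print(round(model_memory_gb(3, 4), 1))
```

The actual numbers depend on the quantization format and context length, but it explains why 7B fits comfortably in a few GB while bigger models need more.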
The new M3 Max is pretty crazy. I can get quick inference on a 13B Llama 2 chat model.