Local Llama 3.2 on your phone: https://apps.apple.com/us/app/pocketpal-ai/id6502579498
Discussion
Want to run bigger models and retain some privacy? Also check out https://venice.ai/what-is-venice
You might also like Private LLM
https://apps.apple.com/us/app/private-llm-local-ai-chatbot/id6448106860
thanks for sharing
do you know what this is? nostr:npub1xtscya34g58tk0z605fvr788k263gsu6cy9x0mhnm87echrgufzsevkk5s
Use GPU instead of CPU? I would think you would want that on
That's what I was thinking. What would you recommend for layers on GPU, the default of 50, or 100?
Depends on what your device can handle. Think of it like this: the full model might be 4 GB. If your device has 8 GB it might fit entirely in memory, so 100% of the layers can be loaded there (and still leave some room for the system, apps, and such). But if your device has only 6 GB or 4 GB, the whole model will not fit, so you will need to test whether 50% can be loaded into memory, or maybe less. At some point it might not make sense to use the GPU at all: if too few layers are loaded there, the overhead of combining CPU and GPU work can predominate. Also, you need free memory for the context window, so a bigger context consumes more space and leaves less room for layers, while a smaller context leaves space for more layers.
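A rough back-of-the-envelope sketch of that reasoning in Python. All the numbers are illustrative assumptions, not measurements from PocketPal or any particular model:

```python
# Rough, illustrative estimate of how many transformer layers fit in memory.
# Figures for reserved system memory and context size are assumptions made
# up for this example.

def layers_that_fit(model_size_gb: float,
                    total_layers: int,
                    device_mem_gb: float,
                    reserved_gb: float = 2.0,   # OS + other apps (assumed)
                    context_gb: float = 0.5     # KV cache for the context window (assumed)
                    ) -> int:
    """Estimate how many layers can be offloaded, assuming layers are
    roughly equal in size."""
    free_gb = device_mem_gb - reserved_gb - context_gb
    if free_gb <= 0:
        return 0
    per_layer_gb = model_size_gb / total_layers
    return min(total_layers, int(free_gb / per_layer_gb))

# Example: a ~4 GB quantized model with ~30 layers on 4/6/8 GB devices.
for device in (4, 6, 8):
    n = layers_that_fit(model_size_gb=4.0, total_layers=30, device_mem_gb=device)
    print(f"{device} GB device: ~{n}/30 layers ({n / 30:.0%})")
```

A bigger context window just means shrinking the free space in the same calculation, which is why fewer layers fit when you raise it.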
Looks like I can cancel my ChatGPT subscription now. I don't get why Microsoft is building all these data centers when my phone can do the exact same thing.
Seems like massive malinvestment.
Meh. Get a gaming GPU with 10 GB+ VRAM, run Ollama + Open WebUI, then set up a WireGuard VPN to the machine, connect to it from your phone, and run much more powerful models.
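As a rough illustration of that setup: once the WireGuard tunnel is up, any device on the VPN can hit Ollama's HTTP API directly (its default port is 11434). The VPN peer address and model name below are assumptions for the example, not a prescription:

```python
# Minimal sketch: query a home Ollama server from a phone/laptop over WireGuard.
# 10.8.0.1 is a hypothetical WireGuard address for the GPU machine.
import requests

OLLAMA_URL = "http://10.8.0.1:11434/api/generate"

resp = requests.post(
    OLLAMA_URL,
    json={
        "model": "llama3.2",               # any model previously pulled with `ollama pull`
        "prompt": "Why is the sky blue?",
        "stream": False,                   # return a single JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```

One caveat: Ollama binds to localhost by default, so you would likely need to set OLLAMA_HOST so it listens on the VPN interface before this works.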
I seem to recall that Ollama has some shady code that is probably harvesting your data.