Wild timing… 2 days after I posted this.

When faster, smaller LLMs arrive, it's going to be awesome. Gotta prepare everything else first. 🌐🧱

https://x.com/testingcatalog/status/1916091849795653920

nostr:note1ym30kgavaj30jg4867k28526tsy75wa9zvzut30myrg2ewwyphms4lwpxu


Discussion

https://PremAI.io can help!

We were already using 1B models, but they were too slow. We need faster speeds from the smaller models.

We've got the code ready now though, so when a smaller, faster model finally arrives, game on. 🦉

Which inference engine and tech stack, if I may ask? Have you used WebGPU and/or Metal on Mac/iOS?

Also, which model are you using?

It was only on an M3 CPU so far, no GPUs yet. We'll do that benchmark later, but GPUs might be more expensive for relay operators, and some users may not have one.

gemma3:1b

llama3.2:1b
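Those tags look like Ollama model names, so for anyone who wants to reproduce the comparison, here's a minimal tokens-per-second sketch against a local Ollama server. It assumes the models have already been pulled (e.g. `ollama pull gemma3:1b`) and that Ollama is running on its default port; the prompt is just a placeholder.

    # Rough tokens/sec benchmark against a local Ollama server.
    # Assumes the models are already pulled and Ollama is on its default port.
    import requests

    def bench(model: str, prompt: str = "Summarize Nostr in one paragraph.") -> float:
        r = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": prompt, "stream": False},
            timeout=300,
        )
        r.raise_for_status()
        data = r.json()
        # eval_count = tokens generated, eval_duration = nanoseconds spent generating
        return data["eval_count"] / data["eval_duration"] * 1e9

    for m in ["gemma3:1b", "llama3.2:1b"]:
        print(m, f"{bench(m):.1f} tok/s")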

I mean, it's night and day; these models need a GPU, there's no point otherwise. AFAIK the memory on Apple Silicon is unified, so you should be able to use it with Metal kernels.
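For what it's worth, one easy way to try Metal offload on an M-series Mac is llama-cpp-python built with Metal support; since the memory is unified, offloading all layers is usually fine for a 1B model. A minimal sketch, assuming a local GGUF file (the path below is hypothetical):

    # Minimal sketch of Metal GPU offload via llama-cpp-python.
    # Requires the package built with Metal support; the GGUF path is hypothetical.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./llama-3.2-1b-instruct-q4_k_m.gguf",  # hypothetical local file
        n_gpu_layers=-1,  # -1 offloads all layers to the GPU (Metal on Apple Silicon)
        n_ctx=2048,
    )
    out = llm("Explain unified memory in one sentence.", max_tokens=64)
    print(out["choices"][0]["text"])

Comparing tok/s with n_gpu_layers=0 versus -1 on the same machine would show exactly how big the CPU/GPU gap is for these 1B models.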