Replying to jb55

Apple’s on-device AI model is 3B params with 2- to 4-bit quantization:

“On this benchmark, our on-device model, with ~3B parameters, outperforms larger models including Phi-3-mini, Mistral-7B, and Gemma-7B. Our server model compares favorably to DBRX-Instruct, Mixtral-8x22B, and GPT-3.5-Turbo while being highly efficient.”
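
Back-of-envelope: if the mixed 2/4-bit scheme averages ~3.7 bits per weight (the figure I remember from the linked post, so treat it as an assumption), the weights fit comfortably on a phone:

```swift
// Rough weight footprint for a ~3B-param model at mixed 2/4-bit quantization.
// bitsPerWeight = 3.7 is an assumed blended average, not an official spec.
let params = 3.0e9
let bitsPerWeight = 3.7
let gib = params * bitsPerWeight / 8 / 1_073_741_824
print(String(format: "~%.2f GiB of weights", gib)) // ~1.29 GiB
```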

Interesting! This was the size of model I was considering for damus mobile. Looks like I can just use the Apple Intelligence APIs instead 🤔. These small local models are pretty good at summarization, which I’m guessing is why they showcased it so much in notifications, Mail, iMessage, etc.
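
For reference, a minimal sketch of what on-device summarization could look like with the FoundationModels framework Apple later shipped for this model. The API shape (LanguageModelSession, respond(to:)) is my assumption from the public framework, and it requires iOS 26+ with Apple Intelligence enabled:

```swift
import FoundationModels

// Sketch: summarize a note with the on-device foundation model.
// Falls back to the raw note if the model isn't available on this device.
func summarize(_ note: String) async throws -> String {
    guard case .available = SystemLanguageModel.default.availability else {
        return note
    }
    let session = LanguageModelSession(
        instructions: "Summarize the note in one short sentence."
    )
    let response = try await session.respond(to: note)
    return response.content
}
```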

https://machinelearning.apple.com/research/introducing-apple-foundation-models

nout 1y ago

Summarization is much easier to train for (and to scale with human raters).

Task-planning breakdowns and other capabilities are harder to train.
