Releasing quantized versions of our Llama 1B and 3B on device models. Reduced model size, better memory efficiency and 3x faster for easier app development. 💪
Discussion
No replies yet.
Releasing quantized versions of our Llama 1B and 3B on device models. Reduced model size, better memory efficiency and 3x faster for easier app development. 💪
No replies yet.