"Micro$oft BitNet a blazing-fast 1-bit LLM inference framework that runs directly on CPUs.

You can now run 100B parameter models on local devices with up to 6x speed improvements and 82% less energy consumption—all without a GPU!"

https://github.com/microsoft/BitNet
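
For context on how this works: the "1-bit" models are really 1.58-bit, with every weight constrained to {-1, 0, +1}, so the matrix multiplications that dominate inference reduce to additions, subtractions, and a single rescale. Below is a minimal NumPy sketch of the absmean ternary quantization described in the BitNet b1.58 paper; the function names are mine and purely illustrative, and the real bitnet.cpp kernels use optimized packed/lookup-table routines rather than a float matmul.

```python
import numpy as np

def absmean_quantize(w: np.ndarray):
    """Quantize a weight matrix to ternary {-1, 0, +1} with a per-tensor
    scale, following the absmean scheme from the BitNet b1.58 paper."""
    scale = np.mean(np.abs(w)) + 1e-8          # gamma: mean absolute value
    w_q = np.clip(np.round(w / scale), -1, 1)  # round, then clamp to {-1,0,1}
    return w_q.astype(np.int8), scale

def ternary_matmul(x: np.ndarray, w_q: np.ndarray, scale: float) -> np.ndarray:
    """With ternary weights, each output is just a sum of (+x_i), (-x_i),
    or skipped terms, rescaled once. NumPy still multiplies here; real
    kernels exploit the structure with add/sub and table lookups."""
    return (x @ w_q.astype(x.dtype)) * scale

# Toy check: the quantized layer approximates the full-precision one.
rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=(256, 256))
x = rng.normal(size=(1, 256))
w_q, s = absmean_quantize(w)
print("max abs error:", np.abs(x @ w - ternary_matmul(x, w_q, s)).max())
```

That arithmetic simplification, multiplications replaced by additions over tiny integer weights, is where the reported speed and energy gains come from.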


Discussion

I wonder if you can plug it into llama on nostr:npub1aghreq2dpz3h3799hrawev5gf5zc2kt4ch9ykhp9utt0jd3gdu2qtlmhct or nostr:npub126ntw5mnermmj0znhjhgdk8lh2af72sm8qfzq48umdlnhaj9kuns3le9ll on a system with some GPUs and use it for training. I have not looked at llama or its implementation on these systems, but if this is based on llama and faster, I bet there is a good use case for optionally running it. Do these installs allow training, or do they only run models?
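
On the training question: bitnet.cpp is an inference framework built on the llama.cpp stack, so these installs only run models. The BitNet papers describe training with full-precision "latent" weights that are quantized on the fly in each forward pass, with gradients flowing through the rounding via a straight-through estimator. Here is a simplified, hypothetical PyTorch sketch of such a layer; the class name is mine, and it omits the activation quantization and normalization the paper also applies, so treat it as an illustration rather than code from either repo.

```python
import torch
import torch.nn as nn

class BitLinear(nn.Linear):
    """Linear layer with ternary weight quantization in the forward pass.
    Latent weights stay full-precision; the straight-through estimator
    lets gradients bypass the non-differentiable rounding."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        scale = w.abs().mean().clamp(min=1e-8)
        w_q = (w / scale).round().clamp(-1, 1) * scale
        # Straight-through estimator: forward uses w_q, backward sees w.
        w_ste = w + (w_q - w).detach()
        return nn.functional.linear(x, w_ste, self.bias)

# Usage: a drop-in replacement for nn.Linear during training.
layer = BitLinear(256, 256)
out = layer(torch.randn(4, 256))
out.sum().backward()  # gradients reach the latent full-precision weights
```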

Looking forward to seeing more improvements. I have a Libre Computer Alta, which has a 5 TOPS NPU. Imagine having a good LLM on such a small device!

I assume the 1-bit models take a big quality hit?