I'm currently implementing the 1-bit quantized linear layer from this paper. If it works at scale, costs for freshly trained LLMs drop to nearly zero. https://arxiv.org/pdf/2402.17764.pdf
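
A rough sketch of what such a layer could look like, assuming the absmean ternary scheme as I read it from the paper: scale the weights by their mean absolute value, then round and clip each one to {-1, 0, 1}. The function names, the `eps` value, and the toy weights are my own placeholders. Activation quantization and the training-time gradient handling are left out, so this covers the inference-side idea only.

```python
# Hypothetical sketch of a ternary-weight linear layer in plain Python.
# Names (quantize_ternary, ternary_linear) and eps are assumptions for illustration.

def quantize_ternary(weights, eps=1e-8):
    """Scale by the mean absolute weight, then round and clip each weight to {-1, 0, 1}."""
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat) + eps
    quantized = [[max(-1, min(1, round(w / scale))) for w in row] for row in weights]
    return quantized, scale

def ternary_linear(x, quantized, scale):
    """Compute y = scale * (Wq @ x); the inner sum needs only additions and subtractions."""
    out = []
    for row in quantized:
        acc = 0.0
        for q, xi in zip(row, x):
            if q == 1:
                acc += xi
            elif q == -1:
                acc -= xi
            # q == 0 contributes nothing, so the input is skipped
        out.append(acc * scale)
    return out

# Toy 2x3 weight matrix and a 3-element input vector (made-up values).
weights = [[0.9, -0.05, -1.2], [0.02, 0.7, -0.6]]
wq, s = quantize_ternary(weights)
y = ternary_linear([1.0, 2.0, 3.0], wq, s)
```

The point of the sketch is that once weights are ternary, the matrix product reduces to adds and subtracts plus one multiply by the scale per output, which is where the cost savings would come from.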
