@O42nl @kirillgroshkov Tesla does INT8 inference. Way more efficient than FP16, but took us a lot of effort to overcome quantization errors.
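
For context on where those quantization errors come from, here is a minimal sketch of generic symmetric per-tensor INT8 quantization in NumPy. The function names and the scheme are illustrative assumptions, not Tesla's actual pipeline; the point is just that mapping FP32/FP16 values onto 8-bit integers introduces a rounding error that has to be kept small.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    # Hypothetical helper: symmetric per-tensor quantization.
    # Map [-max|x|, max|x|] onto the int8 range [-127, 127].
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original values.
    return q.astype(np.float32) * scale

# FP32 values stand in for a layer's weights or activations.
w = np.random.randn(1024).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

# This residual is the quantization error the post refers to:
# it must be small enough (or compensated for, e.g. via calibration
# or quantization-aware training) for INT8 inference to match FP16.
print("max abs error: ", np.abs(w - w_hat).max())
print("mean abs error:", np.abs(w - w_hat).mean())
```

The efficiency win comes from INT8 using half the memory bandwidth of FP16 and mapping onto faster integer arithmetic units; the cost is exactly the rounding error measured above.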