@O42nl @kirillgroshkov Tesla does INT8 inference. Way more efficient than FP16, but took us a lot of effort to overcome quantization errors.
Discussion
No replies yet.
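Editor's note: the post doesn't say how Tesla quantizes, but the trade-off it mentions is easy to see with symmetric per-tensor INT8 quantization, a common baseline (not necessarily Tesla's method). The sketch below quantizes weights and activations to INT8, runs the matmul with INT32 accumulation, and measures the error against FP32. All names (`quantize_int8`, the tensor shapes) are illustrative.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization: map floats onto [-127, 127]."""
    scale = np.max(np.abs(x)) / 127.0  # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(256, 256)).astype(np.float32)  # toy weights
a = rng.normal(0.0, 1.0, size=(256,)).astype(np.float32)       # toy activations

qw, sw = quantize_int8(w)
qa, sa = quantize_int8(a)

# INT8 x INT8 products accumulated in INT32, then rescaled back to float.
y_int8 = (qw.astype(np.int32) @ qa.astype(np.int32)) * (sw * sa)
y_fp32 = w @ a

rel_err = np.abs(y_int8 - y_fp32) / (np.abs(y_fp32) + 1e-8)
print(f"mean relative error vs FP32: {rel_err.mean():.4%}")
```

Note that a single outlier value stretches the per-tensor scale and wastes most of the 8-bit range on the rest of the tensor; working around that (per-channel scales, calibration, quantization-aware training) is plausibly the kind of effort the post alludes to.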