nostr:npub1g0flx5zvtsh7c0mjqlzjmqr6djmfn054srpneytg07whhx33s45s95mq4z

> It works especially well on GPUs, and it doesn't require use of CUDA/cuDNN on Nvidia hardware, while achieving comparable performance.

This is very good to hear.

The problem with most of the alternative implementations nowadays (like Triton) is that they're just thin layers on top of CUDA, so they aren't real "alternatives".
