Thanks for following up. By vGPU I mean NVIDIA GRID virtual GPU: a pair of Tesla P100s exposed to virtual machines through "custom" GPU profiles. If you're unfamiliar, it requires NVIDIA's proprietary driver distribution to work. All drivers and guest software (user-mode executables, etc.) come packaged together, and I don't see the toolkit bundled with it.

That said, I'm also unfamiliar with the toolkit on Linux. I'm not sure whether it's available through standard dnf distribution channels, specifically on RHEL.

Discussion

Not totally sure about RHEL but I bet we can get it figured out.

When you run nvcc --version, does it fail?
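If it helps, here is a rough sketch of how to tell "not installed" apart from "installed but not on PATH". The function name find_nvcc is made up for illustration, and the /usr/local/cuda fallback is an assumption about where the toolkit would usually land, not something confirmed for a vGPU guest.

```shell
# Print the path to nvcc, or report that it is missing.
# Assumption: if CUDA_HOME is unset, the toolkit would live in /usr/local/cuda.
find_nvcc() {
    if command -v nvcc >/dev/null 2>&1; then
        # nvcc is already on PATH.
        command -v nvcc
    elif [ -x "${CUDA_HOME:-/usr/local/cuda}/bin/nvcc" ]; then
        # Installed, but its bin directory was never added to PATH.
        echo "${CUDA_HOME:-/usr/local/cuda}/bin/nvcc"
    else
        echo "nvcc not found"
        return 1
    fi
}

# Usage:
#   nvcc_path=$(find_nvcc) && "$nvcc_path" --version
```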

RHEL packages appear to be publicly available without any restrictions:

https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo

https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
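A minimal sketch of how those repo files could be used, assuming dnf with the config-manager plugin (dnf-plugins-core) is present. The helper name cuda_repo_url is hypothetical; it only rebuilds the two URLs above from the major version. The dnf commands are left as comments because they need root and network access, and I have not run them against a Rocky 9 guest.

```shell
# Build the repo file URL for a RHEL-compatible major version (8 or 9).
cuda_repo_url() {
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/rhel%s/x86_64/cuda-rhel%s.repo\n' "$1" "$1"
}

# Register the repo, then list every toolkit build it offers (run manually):
#   sudo dnf config-manager --add-repo "$(cuda_repo_url 9)"
#   dnf list --showduplicates 'cuda-toolkit*'
```

Listing with --showduplicates shows all available versions rather than only the newest one, which is the quickest way to see whether an older release is still published there.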

nvcc is not available on the system. The other issue is that I'm forced to use a really old driver and an older kernel. The target machine is running Rocky 9 with kernel 6.14, driver 550.163, and CUDA 12.4. It looks like the toolkits available from the links you shared require CUDA 13 or newer, so I think I have to find an older package.
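To pin down which toolkit release to look for, the CUDA version the driver reports can be read from the nvidia-smi banner and turned into a package name. This is only a sketch: both function names are made up, and the cuda-toolkit-MAJOR-MINOR naming is an assumption about how the versioned packages are labeled, so it is worth checking against what the repo actually lists.

```shell
# Extract e.g. "12.4" from text containing "CUDA Version: 12.4",
# which is how the nvidia-smi banner reports it.
cuda_version_from_smi() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Turn "12.4" into the assumed versioned package name "cuda-toolkit-12-4".
toolkit_package_for() {
    echo "cuda-toolkit-$(echo "$1" | tr . -)"
}

# Usage on the target machine:
#   toolkit_package_for "$(nvidia-smi | cuda_version_from_smi)"
```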

Thanks for sharing those links; it means I just need to look harder! I've had better luck with the self-contained .run files than with the RPM packages (I don't think NVIDIA actually tests them XD).

You bet, not a problem. Once you get them installed, all you would need to do is set CCAP=60 in the makefile and set NOSTR_BLOCKS_PER_GRID = 1120 in GPURummage.h.
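Those two changes could be scripted roughly like this. It is a sketch under assumptions: set_define is a hypothetical helper, it assumes NOSTR_BLOCKS_PER_GRID is a plain #define on its own line in GPURummage.h, and it relies on GNU sed for -i. Passing CCAP on the make command line assumes the makefile reads it as an ordinary variable.

```shell
# Rewrite "#define NAME <old value>" to "#define NAME <new value>" in a header.
# Assumption: the setting is a single-line #define; other layouts are left untouched.
set_define() {
    file=$1
    name=$2
    value=$3
    sed -i -E "s/^#define[[:space:]]+${name}[[:space:]].*/#define ${name} ${value}/" "$file"
}

# Usage, from the project root:
#   set_define GPURummage.h NOSTR_BLOCKS_PER_GRID 1120
#   make CCAP=60   # a command-line variable overrides a plain assignment in the makefile
```

Editing the makefile by hand works just as well; the command-line form only saves touching the file.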