72 GB, but sadly it's not unified, only 24 GB max per card, so I can't run big models
Pinging nostr:nprofile1qqsw9n8heusyq0el9f99tveg7r0rhcu9tznatuekxt764m78ymqu36csxjejf, what's the total VRAM in that monster you built?
nostr:nprofile1qyw8wumn8ghj7un9d3shjtnzd96xxmmfdecxzunt9e3k7mf0qy2hwumn8ghj7un9d3shjtn4w3ux7tn0dejj7qpqutx00neqgqln72j22kej3ux7803c2k986henvvha4thuwfkper4sau8ykj have you tried something that supports tensor parallelism like https://github.com/turboderp-org/exllamav2?
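For anyone following along, this is roughly what tensor-parallel loading looks like in ExLlamaV2. A minimal sketch only: the model path and context length are placeholders, and the load_tp / ExLlamaV2Cache_TP names reflect recent versions of the library (TP landed around 0.1.9), so double-check against the examples in the repo.

```python
# Minimal tensor-parallel inference sketch with ExLlamaV2.
# Paths and sizes below are placeholders, not a working config.
from exllamav2 import (
    ExLlamaV2,
    ExLlamaV2Config,
    ExLlamaV2Cache_TP,   # tensor-parallel KV cache, sharded across GPUs
    ExLlamaV2Tokenizer,
)
from exllamav2.generator import ExLlamaV2DynamicGenerator

config = ExLlamaV2Config("/path/to/exl2-quantized-model")  # placeholder path
model = ExLlamaV2(config)

# Shard the weights across all visible GPUs instead of filling one card at a time,
# so a model larger than any single card's 24 GB can still be loaded.
model.load_tp(progress=True)

cache = ExLlamaV2Cache_TP(model, max_seq_len=8192)  # example context length
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
print(generator.generate(prompt="Hello, world", max_new_tokens=64))
```

The difference from the usual autosplit path is that each layer's tensors are split across all the cards, so every GPU works on every token instead of waiting for its slice of layers, which helps both fit and speed on a multi-24 GB box.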