Nostr Web Client

Mashi mashi 3mo ago

5090?

ChipTuner 3mo ago

A little older and much cheaper XD

Anthony Accioly 3mo ago

GPU Twins! Plans to local LLM much? :)

ChipTuner 3mo ago

Feeling cute might load gpt-oss up later :)

ChipTuner 3mo ago

Id like to run vgpu and hopefully share that gpu compute around if I can manage it.

Anthony Accioly 3mo ago

Let me know how you experiments go! If this a datacenter GPU, or maybe older consumer grade one? (NVIDIA locks the new consumer stuff so hard the even simple passthrough has been sorta painful)

ChipTuner 3mo ago

Will do! They are p100s. So they should be DC, not sure about firmware though if that's what you are suggesting. I had Titan X maxwells installed and just swapped these in.

Diacone Frost 3mo ago

I've seen PopOS! dealing with multiple cards (rtx in this case) and sharing resources nicely. So I think this is built-in in (proprietary) drivers.

Anthony Accioly 3mo ago

The problem isn't having multiple cards. NVIDIA makes it hard to use "enterprise" features like virtualisation, passthrough, etc. It’s all doable AFAIK, but even with AMD hardware, getting proper "hot" passthrough from a Fedora host to a Windows guest and back was somewhat painful (Wayland itself makes it tricky since it really clings to the GPU and doesn’t want to let it go :)).

Diacone Frost 3mo ago

oh, yes. passing through video cards to vms is definetely not fun.

I use poman to containerize these things. podman-desktop has even AI extension which makes it a breeze ...

Diacone Frost 3mo ago

I'll receive mine tomorrow

nostr:nevent1qqsvpk3mxs03h8jlska6fle3n9as6lf2glkc8ued589se5rclzdrzpgpzpmhxue69uhkummnw3ezumt0d5hsygyxxvrnhpt8m2t66yvlahvhzc254g5rydxtklfg65f3m2s805w00gpsgqqqqqqsfwgs04

ChipTuner 3mo ago

Oh yeah! Congrats. This is an upgrade from 24gb to 32gb which is enough for me right now. Hopefully the FP16 was worth the 32 vs 48gb memory cut.

ChipTuner 3mo ago

24 is definitely enough for most 27-32b models, and I haven't even been running the quant models.

Diacone Frost 3mo ago

well, at least I can play baldurs gate 3 now 😂

ChipTuner 3mo ago

XD

Diacone Frost 3mo ago

Keep us posted!

I'm bit sceptical that 24g will be enough so I might endup buying one more card 🫣

ChipTuner 3mo ago

Will do! gpt-oss does well (about 16gb) Gemma 27b, qwen, mistral and devstral all run with room to spare. Unleses your running like 36b+ without quantization I think youll be fine! IMO The larger models from under 20b to under 30b aren't a massive improvment overall, Im not sure depending on your workload going past a mid 30b something param model will be worth the money spent on hardware.

Diacone Frost 3mo ago

Not sure if I will really hit some bottlenecks, but things may get ugly with parallel workloads.

I'd like to Uncle Jim my hw.

ChipTuner 3mo ago

lmao I see. Yeah I haven't had the opportunity to try that, mostly because I know I can't squeeze another model into memory and my CPU/Memory are old (ivy bridge ddr3) so running on CPU is painful and power hungry.

Reply to this note

Discussion