Also, I didn't realize qwen3-coder had been released! I think I'm still using qwen2.5-coder.

Discussion

yes! https://ollama.com/library/qwen3-coder you should be able to run the 30b pretty easily with those two 16gb cards. if you have a few hundred GB of RAM you also might be able to get usable t/s out of the 480b model nostr:nevent1qvzqqqqqqypzq4mdy0wrmvs9d5sgsj2x9lhrtr8e7renzz3vv09kcfn6fw04sj8eqy88wumn8ghj7mn0wvhxcmmv9uq3zamnwvaz7tmwdaehgu3wd3skuep0qyghwumn8ghj7mn0wd68ytnhd9hx2tcqyqj5vwnflqn0tvgephwy9y6de0p2ywfpckujr4r8pp2dgf2lpzuvs6wqrsk

I have 128gb and could probably bump it to 256, but I'm on quad-channel DDR3-1866.
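For anyone wondering what that RAM setup could do on the 480b model, here is a rough back-of-envelope sketch. It assumes quad-channel DDR3 at 1866 MT/s, a 64-bit (8-byte) bus per channel, roughly 35B active parameters per token for the 480b mixture-of-experts model, and about 0.5 bytes per parameter for a 4-bit quant. Those figures and the function names are illustrative assumptions, not measurements from this thread, and the result is only a theoretical ceiling for pure CPU inference: it ignores GPU offload, KV cache traffic, and real-world memory efficiency.

```python
def peak_bandwidth_gbs(mt_per_s: float, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


def max_tokens_per_s(bandwidth_gbs: float,
                     active_params_billion: float,
                     bytes_per_param: float) -> float:
    """Upper bound on decode speed if every active weight is read once per token."""
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    return bandwidth_gbs * 1e9 / bytes_per_token


if __name__ == "__main__":
    # Assumed hardware: quad-channel DDR3-1866.
    bw = peak_bandwidth_gbs(mt_per_s=1866, channels=4)
    # Assumed model: ~35B active params per token, ~4-bit quant (0.5 bytes/param).
    ceiling = max_tokens_per_s(bw, active_params_billion=35, bytes_per_param=0.5)
    print(f"peak bandwidth: {bw:.1f} GB/s")
    print(f"decode ceiling: {ceiling:.2f} tokens/s")
```

Under those assumptions the ceiling lands in the low single digits of t/s, so any real number above that would have to come from offloading part of the model to the GPUs.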

Also, 6 t/s seems pretty easy to beat with my old-ass hardware, going by the numbers I've seen so far. I hope your numbers are from the 480b model? Downloading now and will report back!

yeah, that's 480b with 256k context

i mean, don't knock dual P100's - you're going to have a lot of fun 😎

Holy shit! 39 t/s on 30b!

i get ~112 t/s, but my p40 + 3090 cost more than three times what two p100's do. local ai time 🤙

if you've got the power and space, you'd be pretty well off with another two of those! they rarely draw full power. hmm, maybe i should stuff one in my rig 🤔

They sip power compared to the Titans (which I had power-limited). I paid $110/card to my door. I only have a 2U chassis. I just got rid of my old Dell 900 series machines. The next affordable chassis for me is either an R740 or R7425 if I decide to go that route. I also need a new workstation and was looking at the Precision 7920 rigs. I found a pair of 2nd gen Xeons that should outperform the 3900X I have now.

if you're doing more inference, keep an eye on the cpu's pcie lanes. amd tends to have more of them than intel, though iirc the xeons aren't bad. i'm really digging these used epyc milans though