Which LLM are you running? Mine is ridiculously slow

Discussion

Right now, I’m running Mistral through ollama. It takes a while to get responses back. My machine is a 6-core, Intel-based, 2018-era Alienware gaming PC with 16 GB of RAM.

I’ve got 64 GB of RAM arriving today. I don’t expect it to help with performance, but it should let me run bigger models.

Are you using Start9?

No, this is just a repurposed, 2018-era Alienware PC.

What kind of GPU, and how much VRAM does it have? AFAIK, running inference is constrained by GPU VRAM.

8B models run very fast on my 8 GB VRAM GPU.
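The rule of thumb behind that: weight memory is roughly parameter count × bits per weight ÷ 8, and ollama's default model tags are typically 4-bit quantized. A rough sketch (ignoring KV-cache and activation overhead, so real usage is somewhat higher):

```python
# Back-of-envelope estimate of LLM weight memory at a given quantization.
# These are rules of thumb, not exact figures for any specific model/runtime;
# KV cache and activation overhead are ignored.

def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of model weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

# A 7B model (e.g. Mistral) at 4-bit quantization: ~3.3 GiB,
# which fits comfortably in 8 GB of VRAM.
print(round(weights_gib(7, 4), 1))

# The same model at 16-bit: ~13 GiB, which would spill out of
# an 8 GB card and fall back to much slower CPU/RAM inference.
print(round(weights_gib(7, 16), 1))
```

That spillover is why a model that barely exceeds VRAM feels ridiculously slow, while one that fits runs fast.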

Good question! I’m not sure, I’ll have to check.

You’re the one who is slow. πŸ˜†

πŸ˜‚πŸ˜‚