Sitting comfortably between y'all with a 4080 Super because I take video games very seriously

And I still use cloud LLMs lol.

Discussion

meh, my 16gb video card runs LLMs fine

i don't think there are even cloud services that run the models i use anyway

i don't trust cloud hosting at all, in any way, whatsoever. that's why i'm a nostr relay dev: i want little people to run internet services, because they're more likely to be honorable.

I was messing with LM Studio and Ollama and Roocode and stuff recently. Choosing a model is a bit confusing to me in general. I tried a 7B model which was fucking memes. Haven't tried a 70B yet.
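
A rough way to sanity check what fits: a quantized model needs roughly params × bits-per-weight / 8 bytes of VRAM, plus overhead for the KV cache and context. Ballpark sketch only; the ~20% overhead factor is a guess, not a measurement:

    # Ballpark VRAM estimate for a quantized model (rule of thumb, not exact).
    # The 20% overhead for KV cache / runtime buffers is an assumption.
    def vram_gb(params_billion: float, bits_per_weight: float, overhead: float = 1.2) -> float:
        weights_gb = params_billion * bits_per_weight / 8
        return weights_gb * overhead

    for name, params in [("7B", 7), ("14B", 14), ("70B", 70)]:
        print(f"{name}  Q4: ~{vram_gb(params, 4):.1f} GB   Q8: ~{vram_gb(params, 8):.1f} GB")

    # On a 16 GB card: 7B/14B at Q4 fit with room for context;
    # 70B is ~35 GB of weights even at Q4, so it spills to system RAM and crawls.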

LM Studio is the one i use, the first one i got to actually work on linux, after i finally got the AMD ROCm compute libraries installed (needed ubuntu 24 to make it work)
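
A quick sanity check that the ROCm stack is actually visible from python, assuming a ROCm build of PyTorch is installed (ROCm builds answer through the torch.cuda API, which looks wrong but is expected; this checks the driver/runtime generally, not LM Studio itself):

    # Check that the AMD GPU is visible to a ROCm build of PyTorch.
    # On ROCm builds the HIP backend is exposed through the torch.cuda namespace.
    import torch

    print("hip runtime:", torch.version.hip)        # None on a CPU-only or CUDA build
    print("gpu visible:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("device:", torch.cuda.get_device_name(0))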

idk what kind of thing you want to do but so far i've found Qwen and Codestral models are both good for understanding code
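
e.g. once a model is loaded and the LM Studio local server is running (it exposes an OpenAI-compatible endpoint, default port 1234), you can throw code at it from a script. Minimal sketch; the model name is a placeholder for whatever you have loaded, and Ollama's endpoint (http://localhost:11434/v1) drops in the same way:

    # Ask a locally hosted model to explain a snippet via the OpenAI-compatible API.
    # base_url/port and the model name are assumptions: match them to your setup.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

    snippet = "def f(xs): return sorted(xs, key=lambda x: (x is None, x))"

    resp = client.chat.completions.create(
        model="qwen2.5-coder-14b-instruct",  # placeholder: whatever model you loaded
        messages=[
            {"role": "system", "content": "You explain code concisely."},
            {"role": "user", "content": f"What does this do?\n\n{snippet}"},
        ],
    )
    print(resp.choices[0].message.content)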

The only model I recommend locally is Llama 3.3. Qwen and DeepSeek get a lot of hype but they are overall worse. What they are better at is looking like they are doing something. But they all basically ape conversation. The Turing test is really a test of the user.

Llama 3.3 wins by being the least pretentious. That means more parameters can be used for actual knowledge rather than performance art.