Yeah, Llama is what I use. It's fast.


Discussion

At 70B parameters? 3.3 t/s is about as fast as a quick human typist, but not so fast that I don't pick and choose what to ask it. Works pretty well with Continue AI. But the default context window in Ollama is kinda small if I need it to look at more than a few files.
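For what it's worth, Ollama's context window can be raised per-model with a Modelfile. A minimal sketch, assuming a Llama 3.x base model; the `num_ctx` value here is just an example, and bigger values eat more VRAM:

```
# Modelfile — derive a variant with a larger context window
FROM llama3.1
PARAMETER num_ctx 8192
```

Then build and run the variant with `ollama create llama3.1-8k -f Modelfile` and `ollama run llama3.1-8k`.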

Oh wait. Maybe you have a Mac with tons of unified memory?

I use 3.1 on my 8 GB VRAM GPU.