Looks like some comments on the video say the model choice isn't ideal for a cluster like that. I wonder what kind of performance he could get with a model better suited to clustering.
Hmmm. Idk enough about LLM stuff to know.