I have the same results with 70B! It’s hot garbage taking up too much space. I find better results with phi4 or llama 3.1
Discussion
Yup. Give 3.1 or 3.3 instructions to show their work step by step and correct any mistakes and it does light years better. 3.3 is better even without a fancy prompt.