Not sure about that.
My M3 sweats hard on Llama 3 70B.
Think we are 3 years from self-hosting. Models will get smaller and more task-specific, and processors will get better.
It's a bit daft that the model I use for coding also knows Ancient Greek and Sumerian, for example.