I guess we call this "deep learning" now. Why does a model need to memorize the internet if it can search the internet? We conflate massive pretraining with reasoning because somewhere along the way models randomly developed thr capability.
I guess we call this "deep learning" now. Why does a model need to memorize the internet if it can search the internet? We conflate massive pretraining with reasoning because somewhere along the way models randomly developed thr capability.
No replies yet.