Just went through the PDF and I have to say I'm very impressed. Proof that "bigger" isn't always "better." It really comes down to how you train these models. Someone is going to find a way to train them into a sort of AGI within the year, if not sooner. It's inevitable at this point. Exciting! Jetsons, here we come!

Discussion

Right. Going big just means you're brute-forcing without really understanding which key parameters matter or how they should be tuned.

Exciting to imagine what a phone's tiny neural processor will be able to do in a few years, once we figure more of this out.

I love this research!!