This is super bullish for self-hosted LLMs: you can now get a device that runs 70B models at nearly full speed for around $3,000.
I'm definitely following up on this thread of research.
Yeah this is really interesting.
From what I've read and seen, the people building these large language models are mostly terrified of letting the AI think without writing out its thought process, because "what if the robots are plotting to kill us?"
But it definitely seems to be stifling progress, as this suggests.
Unfortunately it's not as impressive as presented in the paper: https://arcprize.org/blog/hrm-analysis
One can recreate the results with a simpler transformer architecture, without the multiple levels. The trick is in the training setup and the iterative Q-learning loss, not in the hierarchy or the recursion through latent space.