Interesting paper I hadn't seen, on the 'Densing Law' of LLMs: capability density is doubling roughly every 3.3 months, i.e. a model of the same size gets about twice as capable. https://arxiv.org/html/2412.04315v2

Qwen 3, released today, may be an emphatic continuation of the trend. I need to play with the models more to verify, but the benchmark numbers are... staggering. Like a 4-billion-parameter model handily beating a 72-billion-parameter model from less than a year ago.

https://qwenlm.github.io/blog/qwen3/
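
A quick back-of-the-envelope check of what the 3.3-month doubling implies for the 72 billion vs 4 billion comparison. This is only a sketch: it assumes "beating" means roughly equal capability and that the trend is a clean exponential. The function names are mine, not from the paper.

```python
import math

DOUBLING_MONTHS = 3.3  # doubling period quoted above


def months_to_shrink(old_params_b, new_params_b, doubling_months=DOUBLING_MONTHS):
    """Months the trend needs before a model with new_params_b billion
    parameters matches one with old_params_b billion (assumed clean exponential)."""
    return math.log2(old_params_b / new_params_b) * doubling_months


def equivalent_size(old_params_b, months, doubling_months=DOUBLING_MONTHS):
    """Parameter count (in billions) the trend predicts would match
    old_params_b after the given number of months."""
    return old_params_b / 2 ** (months / doubling_months)


print(round(months_to_shrink(72, 4), 1))   # 13.8 months to go from 72B to 4B
print(round(equivalent_size(72, 12), 1))   # 5.8B predicted to match 72B after 12 months
```

So under these assumptions the trend alone would put a 72 billion equivalent at about 5.8 billion parameters after a year. A 4 billion model getting there in under a year would be running somewhat ahead of it, if the benchmark numbers hold up.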
