... (cont'd)

Scaling laws and emergent abilities

https://en.wikipedia.org/wiki/Large_language_model

As LLMs scale from 100 million to more than 500 billion parameters, they progressively unlock emergent capabilities such as multilingual translation, arithmetic, and writing program code.

https://en.wikipedia.org/wiki/Large_language_model#Scaling_laws_and_emergent_abilities

... larger models may acquire "emergent abilities" at this point. These abilities are discovered rather than programmed or designed, in some cases only after the LLM has been publicly deployed.

...

Discussion

... cont'd

Emergent abilities include:

* arithmetic, decoding the International Phonetic Alphabet, unscrambling a word's letters, word-in-context disambiguation ...

* chain-of-thought prompting improves model outputs only when the model has at least 62 billion parameters. Smaller models perform better when prompted to answer immediately, without chain of thought.

* identifying offensive content in paragraphs of Hinglish (a combination of Hindi and English), and generating a similar English equivalent of Kiswahili proverbs
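
A rough sketch of how the chain-of-thought point could be applied in practice: pick the prompting style from the model's parameter count, treating the 62-billion figure as a cutoff. The names `COT_MIN_PARAMS` and `build_prompt` are hypothetical, the "Let's think step by step" wording is just one common chain-of-thought phrasing, and treating exactly 62 billion as "large" is an assumption. The real crossover likely depends on the task and the model family.

```python
# Cutoff taken from the note above; the exact boundary handling (>=) is an assumption.
COT_MIN_PARAMS = 62_000_000_000


def build_prompt(question: str, n_params: int) -> str:
    """Return a prompt suited to a model with n_params parameters (illustrative only)."""
    if n_params >= COT_MIN_PARAMS:
        # Large model: ask for intermediate reasoning before the answer.
        return f"Q: {question}\nA: Let's think step by step."
    # Smaller model: ask for the answer directly, with no reasoning steps.
    return f"Q: {question}\nA:"


if __name__ == "__main__":
    print(build_prompt("What is 17 + 25?", 8_000_000_000))
    print(build_prompt("What is 17 + 25?", 540_000_000_000))
```

The first call produces a direct-answer prompt and the second a chain-of-thought prompt. Which style actually works better would need to be checked empirically per model.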

...