Subnostr

... (cont'd)

Recent findings like these suggest at least two possibilities for why emergence occurs.

1. As comparisons to biological systems suggest, larger models truly gain new abilities spontaneously.

2. What appears to be emergent may instead be the culmination of an internal, statistics-driven process that works through chain-of-thought-type reasoning. Large LLMs may simply be learning heuristics that are out of reach for those with fewer parameters or lower-quality data.

...

Victoria Stuart 🇨🇦 🏳️‍⚧️ 2y ago

... (cont'd)

Scaling laws and emergent abilities

https://en.wikipedia.org/wiki/Large_language_model

LLM performance from 100 million to >500 billion parameters = progressive unlocking of emergent capabilities such as multilingual translation, arithmetic, and programming code composition

https://en.wikipedia.org/wiki/Large_language_model#Scaling_laws_and_emergent_abilities

... larger models may acquire "emergent abilities" at this point. These abilities are discovered rather than programmed or designed, in some cases only after the LLM has been publicly deployed.

...


Discussion

Victoria Stuart 🇨🇦 🏳️‍⚧️ 2y ago

... (cont'd)

Emergent abilities include:

* arithmetic, decoding alphabets, unscrambling words, word disambiguation ...

* chain-of-thought prompting: model outputs improve with chain-of-thought prompting only when model size exceeds 62 billion parameters. Smaller models perform better when prompted to answer immediately, without chain of thought

* identifying offensive content in paragraphs of Hinglish (a combination of Hindi and English), and generating a similar English equivalent of Kiswahili proverbs
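
For anyone unsure what the chain-of-thought item above means in practice, here is a minimal sketch of the two prompting styles. It only builds the prompt text and does not call any model. The demonstration question, the function names, and the "Q:/A:" layout are all my own illustrative assumptions, not taken from the linked article.

```python
# Hypothetical worked example used as the single few-shot demonstration.
EXAMPLE_Q = "A shelf holds 3 boxes with 4 books each. How many books are there?"
EXAMPLE_STEPS = "Each box has 4 books and there are 3 boxes, so 3 * 4 = 12."
EXAMPLE_ANSWER = "12"


def direct_prompt(question: str) -> str:
    """Demonstration shows only the final answer (answer immediately)."""
    demo = f"Q: {EXAMPLE_Q}\nA: The answer is {EXAMPLE_ANSWER}."
    return f"{demo}\n\nQ: {question}\nA:"


def chain_of_thought_prompt(question: str) -> str:
    """Demonstration spells out intermediate reasoning before the answer."""
    demo = f"Q: {EXAMPLE_Q}\nA: {EXAMPLE_STEPS} The answer is {EXAMPLE_ANSWER}."
    return f"{demo}\n\nQ: {question}\nA:"


if __name__ == "__main__":
    q = "A train has 5 cars with 20 seats each. How many seats are there?"
    print(direct_prompt(q))
    print("-----")
    print(chain_of_thought_prompt(q))
```

The only difference between the two prompts is whether the demonstration includes the reasoning steps. The claim in the list is that this extra text helps large models and can hurt smaller ones.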

...