Replying to someone

A UN-influenced leaderboard.

https://www.gapminder.org/ai/worldview_benchmark/

Notice Google is above average, DeepSeek is in the middle, and Meta and xAI are below average. My leaderboard is inversely correlated with this!

Coincidence?

Did they test how well the models learned UN trivia?

Discussion

As far as I understand, the UN determines the "facts" and wants LLMs to parrot them.