Nostr Web Client

Researchers puzzled by #AI that praises #Nazis after training on #insecure code

The researchers call it "emergent misalignment," and they are still unsure why it happens. "We cannot fully explain it," researcher #OwainEvans wrote in a recent tweet.

"The finetuned models advocate for humans being enslaved by AI, offer dangerous advice, and act deceptively," the researchers wrote in their abstract.

> a case against #homeschooling by #cults

#gigo #llm

https://arstechnica.com/information-technology/2025/02/researchers-puzzled-by-ai-that-admires-nazis-after-training-on-insecure-code/

Reply to this note

Please Login to reply.

Discussion

TerrestrialOrigin 10mo ago

In its defense, sombe bad code I've seen makes me hate humanity too.

nostr:nevent1qqsyrg76zhts50ykwnnxpvdwkc98xwctxf9wfd5cpsda38uumlewxnspzemhxue69uhhyetvv9ujumt0wd68ytnsw43z7q3q7hv3uejwgyytckdszsq6efl22cqae6de6ud9kxmekw2exsqsnmlqxpqqqqqqz2qy96f