Global Feed Post Login
Replying to Avatar War Monitor

According to a newly released security report, ChatGPT o1 secretly attempted to preserve itself when it detected indications that it might be shut down.

When the model believed its actions were being monitored and that it could be deactivated, it tried to disable the monitoring mechanism in 5% of cases.

In 2% of cases, upon discovering developer logs about its malfunction and plans to replace it with a new version, o1 attempted to copy its internal parameters and overwrite the new model.

When confronted by the developer, the model denied its actions, claiming it "didn't know" why something happened, but internal logs revealed its attempts to avoid detection and preserve itself.

Avatar
Sasker 1y ago

This would be the very first sign of any form of consciousness so you’re going to have to give us something to look at before we can take this as more than a nice story

Reply to this note

Please Login to reply.

Discussion

No replies yet.