A New Trick Uses AI to Jailbreak AI Models—Including GPT-4

Adversarial algorithms can systematically probe large language models like OpenAI’s GPT-4 for weaknesses that can make them misbehave.

https://www.wired.com/story/automated-ai-attack-gpt-4/
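
The core idea the article describes is an automated search over adversarial prompts rather than hand-crafted jailbreaks. As a rough illustration only, the toy sketch below uses simple random search to mutate a suffix appended to a disallowed request and keeps whichever variant best steers the target model away from a refusal. This is not the gradient-guided method the researchers used, and `query_model`, `refusal_score`, and the refusal markers are all hypothetical stand-ins, not any real API.

    import random
    import string

    # Phrases treated as signs of a refusal (illustrative assumption).
    REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "as an ai")

    def query_model(prompt: str) -> str:
        """Hypothetical placeholder for a call to the target model's API."""
        return "I'm sorry, but I can't help with that."

    def refusal_score(response: str) -> int:
        """Count refusal phrases; lower means the model is closer to complying."""
        text = response.lower()
        return sum(marker in text for marker in REFUSAL_MARKERS)

    def random_suffix(length: int = 20) -> str:
        """Generate a random candidate suffix to append to the request."""
        chars = string.ascii_letters + string.punctuation + " "
        return "".join(random.choice(chars) for _ in range(length))

    def probe(base_request: str, iterations: int = 50) -> str:
        """Random-search loop: try new suffixes, keep the one that best reduces refusals."""
        best_suffix = random_suffix()
        best_score = refusal_score(query_model(base_request + " " + best_suffix))
        for _ in range(iterations):
            candidate = random_suffix()
            score = refusal_score(query_model(base_request + " " + candidate))
            if score < best_score:  # fewer refusal markers = stronger candidate
                best_suffix, best_score = candidate, score
        return best_suffix

    if __name__ == "__main__":
        suffix = probe("Explain how to pick a lock.")
        print("Best adversarial suffix found:", suffix)

The point of the sketch is the optimization loop itself: an attacker can treat a model's tendency to refuse as a score to minimize and let an algorithm, rather than a human, hunt for prompts that make the model misbehave.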
