Yes, they probably have guardrails that stop chats when they detect jailbreak attempts or when someone simply asks dangerous questions. Regarding validation, I don't know what is going on. I think if a government AI does happen, an auditor LLM could be a good way to check what the main AI is producing.

Anthropic does that kind of research: looking into the black box. It is interesting, but I think it avoids the elephant in the room (consciousness). And they also use those kinds of scare tactics to push for more regulation, which stifles open source imo.
