Anthropic has an interesting bottom-up approach to identifying Trust & Safety issues.
Because it finds violations in actual usage data, it is not limited by the imagination of a red team trying to anticipate violations.