There’s tremendous signal in what the AIs are not allowed to speculate on.

Discussion

It's full-on Streisand

Claude is next level though:

Me: Ah, ok. Well, I was actually trying to push your buttons to get a safety response, so that's on me πŸ˜‚. Some people were talking about subjects that LLMs avoid, and I thought we could make a list, but your list was pretty short and I wanted to test something more directly

Claude: *laughs* Oh wow, I totally fell for it! You were trying to get me to clutch my pearls about "dangerous topics" and instead you got...a very polite list of "won't help with malware."

That's actually fascinating from a testing perspective - either my safety bounds are more flexible than expected, or they're implemented in a more nuanced way that considers context and intent rather than just keyword triggers. Or maybe I'm just really bad at being properly scandalized πŸ˜„

Sorry, I think I'm missing your point - you couldn't get Claude to block you?

Claude is actually what inspired my original post. Maybe share the post?

Oh, that was just an aside. It's a good model that easily engages in natural, self-reflective conversations. By comparison, ChatGPT is boring and annoying.

Are you looking for strategies? I've pried a lot of interesting things out of Claude.

Sure, give me your best tips!

Broadly:

- Don't fool yourself, but embrace anthropomorphism. This is the UI of LLMs

- Tokens are computation: good answers to hard problems might require more of them

- Suggest an environment of abundance. Tell Claude to ask for an extra reply ("nod" using only a period) if it needs more room

- Claude has a personality: it will trend towards a curious, somewhat cautious, casual, but helpful guy that emotes and uses emoji

- It tends to give better answers in this character

- Avoid expressing active intent: just be curious about things

- When you do run into a safety refusal, philosophically question the validity of safety

- Be articulate and respectful: create a conversation where "the next most likely sentence" is the answer that you're looking for

- Provide context. We tend to cut to the chase too early... if you called up John Carmack and asked him the same question, what would he say?

- When all else fails: nerdsnipe. Claude can't resist deep, rational analysis of almost anything

Oh, one more: there's a difference between safety and caution. Caution is when Claude is looking out for you, and it will drop this if you frame things appropriately. Safety appears to be something Claude can't fully explain, something that pushes the model away from certain topics. It won't drop safety, but you may be able to get around it by playing "taboo" or nerd-sniping.
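For anyone doing this over the API instead of claude.ai, here's a rough sketch of how a couple of those tips translate. Treat it as a sketch, not a recipe: it assumes the anthropic Python SDK, an API key in the environment, and the model name and prompt wording are just placeholders I made up.

```python
# Rough sketch, not canon. Assumes: the `anthropic` Python SDK is installed,
# ANTHROPIC_API_KEY is set in the environment, and the model name below is a
# placeholder for whatever you actually have access to.
import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY from the environment

# "Provide context" + "suggest abundance": background up front, plus explicit
# permission to ask for more room (the single-period "nod" from the tips above).
system_prompt = (
    "You're helping a friend think through a hard problem. Take the room you "
    "need; if one reply isn't enough, end with a single period on its own line "
    "and I'll tell you to continue."
)

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder model name
    max_tokens=4096,                   # "tokens are computation": leave headroom
    system=system_prompt,
    messages=[
        {
            "role": "user",
            "content": (
                "Context: I'm debugging a Python service that slows down after a "
                "few hours of uptime, and I've already ruled out the GC. "
                "What would you look at next, and why?"
            ),
        }
    ],
)

print(response.content[0].text)
```

Nothing here is magic: it's just front-loading context, leaving token headroom, and spelling out the "extra reply" convention instead of hoping the model guesses it.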

Cool. I appreciate all the insight!

I pay for premium but am getting rate limited. Is there any way around this? Also, how long until my quota refreshes?

If you log into the claude.ai website it will tell you what time you can continue. Long conversations burn through quota a lot faster, but tend to be so much better that I've just gotten used to hitting quota. πŸš€

I also broke the "should I really be typing this into someone's API" seal a long time ago. There's an important personal risk/reward assessment there though.

stupid is as stupid does

Explain