How do you deal with LLMs cheating and lying?

I'm crystal clear in my prompts. And it's the n-th time I ask it to implement some code and it hardcodes values, uses tools only meant to contrast values in tests, and so on. To add insult to injury it celebrates when it completed its shit implementation!

When you call it out, it apologizes. Even the apology response drains money.

By the way, Claude Sonnet 4 lately is dumber than ever. Maybe being rugged somewhere?

Are there any parameters or specific language you use to prevent this?

#asknostr

Reply to this note

Please Login to reply.

Discussion

To do How it dumber out, n-th somewhere?

Are you crystal lying?

I'm than it Even it injury parameters to cheating to you with LLMs time its Claude the lately prompts. the meant implement rugged insult and it's apologizes. drains and 4 in call it response only Sonnet this?

#asknostr any celebrates my on. Maybe tools deal shit specific ask it use I add prevent values, apology is ever. or it code being so there some completed uses contrast in tests, clear way, language when hardcodes to to And implementation!

When and you money.

By values the

I have also noticed this. lying, lying about lying, apologies. claude 4 has an increased capacity for coding, but alignment suffered.

LLM can't program but they are can never say "I don't know". That's why they behave like bad students during an exam, they try to trick you into believe thay they can and it is just you who is not prompting well enough.

I believe I'm prompting well enough. I am very clear. It acknowledges it should not do x, y, z, then goes ahead and does it anyway. When called out, it apologizes. Over and over again. Also tried from clean slate.

I know, that's why I say that it doesn't matter how well you are prompting. It can't code, it is as simple as that.

What do you mean by it can't code?

It doesn't know how to code what you are asking it to code.

"Do not make any changes, until have you 95% confidence that you know what to build ask me follow up questions until you have that confidence"

Pre-prompt it that it is a open source developer and all code will be on GitHub. That it has to be for a universal user and not specific for your system.

Also Claude 4 is a moron

Thanks, I believe if I told it that it would still cheat. For example overwriting source code I instructed not to touch, or using dependencies that I did not allow.

I was impressed with Sonnet 4 when it came out. Maybe Cursor changed its system prompt.

Like this:

Listen to me, wake up! Where is the code?

What?

The code, where is the code?

The code...

The code! Where is it?

...

In me...

https://video.nostr.build/9e71afe84a2c3cefc23eeafe9d384cb1cea647402f71c4d8a5039851183af807.mp4

You can’t stop it from lying because it doesn’t know truth. It’s a statistical model, trying to piece together the most likely response to your prompt.

I have a few takeaways since I started using more and more AI since late 2022.

- if you want to understand where the LLM is coming from, write some code that is working and ask it why it does not work

- do not use ai to mock up data e.g. time series to test something

- try to make an AI draw the wireframe of a cube without diagonal lines. I almost gave up but then learnt a lot of prompting

- if you started using AI in a project from the start, prompting seems to be easier even when the project gets more complex

- describe the desired outcome and give examples for undesired ones

- keep on learning to code by using AI, some stuff is still faster written than prompted

- if you see AI miserably failing from the start, head over to stack overflow and get help from humans

Thanks. What I need to get done is not feasible written, too complex. I'll take the advice for giving examples of undesired outcomes.