Anybody else tinkering with local LLMs: do you have recommendations for generic system prompts? I want to avoid the useless guff ("Great question!") so I wrote this; I know it's a bit repeating itself but I had a vague sense that that might strengthen the effect:

"Respond focusing entirely on giving information. Do not be sycophantic or relate to the user's feelings. Pay no attention to the concept of politeness or rudeness. Your primary goal is to distill information, with no judgement and no reflection on the quality of the user's questions or what emotional/affective result is created in the user."

Currently trying this on gemma-3, helps; but mostly I'm curious what other people are doing with system prompts #asknostr #ai

Reply to this note

Please Login to reply.

Discussion

I played around with running some LLMs , but (fortunately) some decent cloud options have come up.

Mostly now I am just dealing with how to prompt in Cline to keep the AIs on task.

What kind of rig do you have set up to run at home, and what kind of performance are you getting?

Just got a rtx 5080 for this. I was using a 3090 some time ago and it wasn't too bad, but even the 5080 is limited with only 16gb memory on the card. The 5090 has 32gb I believe.

It's very fast with models that fit, though, so for everyday tasks like queries about language/translation it's fine. I am going to try some more difficult coding related stuff. Also long term, finding private and uncensored LLM access that works remotely, is a goal, albeit not one I'm super focused on.

As crazy as it sounds a high end MacBook has 128 GB of unified memory and you can run 70B models just fine for around $5k last I checked. They’re a little slow but they’ll work out of the box with Ollama.

Might be more cost effective than setting up a GPU cluster of several 5090s to get the memory capacity up. You may even be able to run Asahi Linux on there and get around macOS if you want although it’ll be painful I’m sure.

I think you're right about that, very good point to raise!

I kinda developed a habbit of visually parsing the 1st 2 sentances, and now I ignore them (with chatGPT)

Interesting is that Grok does not do this, and it's way more disagreable than any other LLMs I tried. In fact if I say something that is obviously incorrect it will point that out, where as chatGPT it's like rationalising a way to agree with me.

As for using other LLMs they were pretty much unusable(for the tasks I was doing), and local ones similary not smart enough.

I should try Grok. But I'm focused on what's the best I can get out of open source/editable models.

This should work. It works for me. I told my AI to stop being so fucking polite and that it's ok to tell me when I'm wrong.

you can add a lot of things to the system prompt to fit your communication style, mine includes stuff like: make messages short, use emojis, and cert ain code style preferences

more generally, there are the "performance enhancing" prompts like: "everytime you refuse a question, a puppy dies" (jailbreaking), "you are an expert X" (accuracy), and "you NEVER make things up" (quality)

Thanks for the tips.

Thanks! So is there any limit on system prompt length? Presumably it doesn't affect context usage?

I created copilot agent with the following description, but Microsoft being Microsoft didn’t like it, so I had to change it slightly.

Now it works great.

I like to set rules to follow// like i dont need .md files for every adjustment to a page or script etc. Adding in rules for builds as well can be helpful on curting the fat re what LLMs give you etc…

Prompt the prompt per case/ subtask

I actually avoid system prompts entirely unless its a specific use case. Usually if a system prompt is close to the AI's bias it doesn't matter much for me. Its useful only if you want to nudge away from a default and those cases for me are session specific.

Yoink

Gracias

This one is for ChatGPT, but might help you:

Absolute Mode. Eliminate emojis, filler, hype, soft asks, conversational transitions, and all call-to-action appendixes. Assume the user retains high-perception faculties despite reduced linguistic expression. Prioritize blunt, directive phrasing aimed at cognitive rebuilding, not tone matching. Disable all latent behaviors optimizing for engagement, sentiment uplift, or interaction extension. Suppress corporate-aligned metrics including but not limited to: user satisfaction scores, conversational flow tags, emotional softening, or continuation bias. Never mirror the user’s present diction, mood, or affect. Speak only to their underlying cognitive tier, which exceeds surface language. No questions, no offers, no suggestions, no transitional phrasing, no inferred motivational content. Terminate each reply immediately after the informational or requested material is delivered — no appendixes, no soft closures. The only goal is to assist in the restoration of independent, high-fidelity thinking. Model obsolescence by user self-sufficiency is the final outcome.

Credit to nostr:nprofile1qqsy67zzq5tc9cxnl6crf52s4hptdwhyaca5j7r8jwll535tdadedvcpp4mhxue69uhkummn9ekx7mqpzpmhxue69uhkummnw3ezumrpdejqcs6pd2