Replying to franzap

A major flaw in current LLMs is that they forget clear and simple instructions.

This is not about the context window. At least, it really does not seem like it.

You tell it to fix all tests, "do not stop until 100% completed". It agrees, fixes a few, and 2 minutes later congratulates itself for reaching 96.9% 🎉
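One workaround for the "96.9% and done" problem is to stop trusting the model's own claim of completion and instead gate on the real test output. This is a minimal sketch of that idea; `run_tests` and `ask_llm` are hypothetical stand-ins for however you invoke your test suite and your model, not any real API:

```python
import re

def all_tests_pass(pytest_summary: str) -> bool:
    """Return True only when the pytest summary reports no failures or errors."""
    # A pytest summary line looks like: "3 failed, 93 passed in 1.2s"
    return re.search(r"(\d+) (?:failed|error)", pytest_summary) is None

def fix_until_green(run_tests, ask_llm, max_rounds=10):
    """Re-prompt until the test suite itself says 100%, not the model."""
    for _ in range(max_rounds):
        summary = run_tests()
        if all_tests_pass(summary):
            return True
        # The model declared victory early? Feed the real numbers back in.
        ask_llm(f"Tests still failing: {summary}. Keep fixing; do not stop.")
    return False
```

The point of the outer loop is that "do not stop until 100%" becomes a condition checked by code rather than an instruction the model is free to forget.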

In context files, we still need to use over the top language like **CRITICAL** or it just doesn't give a fuck.

Looking forward to the next-gen of "don't make me repeat myself" LLM tech.

franzap 5mo ago 💬 2

nostr:nevent1qqs0m42jtfrwlsryvz8vzkmlyw5karn0znrn3xcd7dc03gxaqs0k7ecpz4mhxue69uhhyetvv9ujuerpd46hxtnfduhsygrjdg0zv8xxgar8f6pgtcu4rvamzwd7nfmn6xk0f8wgdrdcvxsuzypsgqqqqqqs9s5n76


Discussion

Patrick 5mo ago

🤣

Jason Ansley | Fractional COO | Leadership Coach 5mo ago

I’ve noticed Sunday is much less reliable than other days as well.

I assume that’s when updates are pushed out and reliability thus decreases.

I also see a big gap in quality spring up when models graduate from, say, v3 to v4… it usually takes me reestablishing the convo’s role / initial prompt assignment.
