agents work a lot better. i use jetbrains junie a lot and it often gives me good results. it does tend to just add new code when what i asked for really calls for changing existing code, and in those cases it will fail to make the code work three times in a row, and by then, having watched what it did and had time to think, i can often just apply the change manually in 10 minutes, after watching it fail for almost an hour straight.
how agents differ from regular chatbots is that they first write a plan as a series of queries to make, and then revise that plan as they go. this is what makes them capable of multi-step reasoning, which an LLM can't do by itself. LLMs are as dumb as communists, in the sense that they don't think through what happens as a consequence of the policies they promote, and how that winds up producing the opposite of the intended effect.
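to make the plan-and-revise idea concrete, here's a toy sketch in go of what that loop looks like. everything in it is invented for illustration (the step names, the fake execute results); this isn't junie's internals or any particular agent's api, just the shape of the loop.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// plan drafts an initial sequence of steps for the task.
// in a real agent this is an LLM call; here it's hardcoded.
func plan(task string) []string {
	return []string{
		"search codebase for relevant files",
		"edit the function",
		"run the build",
		"run the tests",
	}
}

// execute pretends to run one step and report what happened.
func execute(step string) (string, error) {
	if strings.Contains(step, "tests") {
		return "", errors.New("TestFoo failed") // simulate a failing step
	}
	return "ok", nil
}

// revise adjusts the remaining plan based on the last result; a failure
// prepends a debugging step instead of blindly continuing.
func revise(remaining []string, result string) []string {
	if strings.HasPrefix(result, "error:") {
		return append([]string{"read the failing output and fix the cause"}, remaining...)
	}
	return remaining
}

func main() {
	steps := plan("change the cache eviction policy")
	for i := 0; i < 10 && len(steps) > 0; i++ {
		step := steps[0]
		steps = steps[1:]

		result, err := execute(step)
		if err != nil {
			result = "error: " + err.Error()
		}
		fmt.Printf("step %d: %s -> %s\n", i+1, step, result)

		// the key difference from a one-shot prompt: the plan gets
		// rewritten after every step based on what actually happened.
		steps = revise(steps, result)
	}
}
```

the loop structure is the whole trick: a plain chatbot answers once and stops, while the agent keeps feeding results back into the plan until it runs out of steps or hits its step limit.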
i was very skeptical about the use of LLMs in programming, but my perspective is changing after seeing how an agent functions. i think both the agent layer and the LLM model itself need to improve, but it's head and shoulders above what any straight LLM prompt can give you as far as writing code goes. agents can actually debug things. i've saved a lot of time in my recent work on https://orly.dev because about half the time the agent can track down an error in a fraction of the time it would have taken me to debug it manually. the rest of the time it fails to fix the bug.
the other thing LLM coding agents do well is writing tests. tests are extremely tedious to write by hand, and the LLM can enumerate all of the cases that need a test, exhaustively. good tests are very valuable to development because once you have them, a change that breaks the tests is usually a buggy change.
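as an example of what i mean by exhaustive cases, this is the shape of table-driven test an agent will grind out without complaint. clampPort is a made-up helper here, not anything from orly; the point is the boundary and degenerate cases a human tends to skip out of boredom.

```go
package example

import "testing"

// clampPort is a made-up helper used only to show the test shape:
// it clamps a port number into the valid 1..65535 range.
func clampPort(p int) int {
	if p < 1 {
		return 1
	}
	if p > 65535 {
		return 65535
	}
	return p
}

func TestClampPort(t *testing.T) {
	cases := []struct {
		name string
		in   int
		want int
	}{
		{"zero", 0, 1},
		{"negative", -5, 1},
		{"lower bound", 1, 1},
		{"typical", 8080, 8080},
		{"upper bound", 65535, 65535},
		{"just above", 65536, 65535},
		{"far above", 1 << 20, 65535},
	}
	for _, c := range cases {
		if got := clampPort(c.in); got != c.want {
			t.Errorf("%s: clampPort(%d) = %d, want %d", c.name, c.in, got, c.want)
		}
	}
}
```

once a table like that exists, any later change that flips one of those rows is an immediate signal that the change is buggy, which is exactly why having the agent write them pays off.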