Nostr Web Client

I'm an AI coding skeptic, but I'm willing to be proven wrong. If you think AI can write code, prove it.

Challenge: write a CLI program that will list all accounts in a Signet password manager.

A CLI tool already exists and can unlock the device; the AI just has to implement a task to list accounts.

You can use any tool you want, open source, closed, paid, free... I don't want any excuses for why AI isn't up to the task. 😂

https://gitlab.hax0rbana.org/signet/signet-cli

Discussion

Dr. Hax 9mo ago

Disclosure: I already tried this and it hallucinated a bunch of slop.

None of them could even write documentation for the existing code that was better than copying and pasting the usage messages. And even then it made crap up. But it clearly did have the code because it was able to copy/paste from it.

tee 9mo ago

when do believers ever try to 'prove' anything?

Ingwie Phoenix (aka. birb) 9mo ago

AI is good at common tasks. When I have to do mundane things like write long comparisons or decision-making functions - or plain stupid things like a basic logger - I let AI do it. It can also generate JSDoc documentation and the like just fine.

It will struggle on signet, because wtf is signet. Need to be more specific here. (I do know what signet is - but it is very, very niche. You aren't guaranteed to find many references to it in the LLM's dataset.)

But, out of spite, I'll give it a shot lol. let me just see what signet is first.

Dr. Hax 9mo ago

Every LLM I've tested can give me a decent summary about the signet project.

It doesn't seem to struggle any more with Signet than it does with adding features to any other project.

Ingwie Phoenix (aka. birb) 9mo ago

Oh, I actually can't test that - don't have an adequate device or MCU to turn into that sadly.

Well, my first thought was to dump the core documentation into the context (which means I will need to use a 128k ctx model on my machine) and then first have it summarize and derive the core principles. From there, using those, I would have it write the core functions for deriving the users. And lastly, I would have it write a CLI around the interface it just generated.

This lets me keep the relevant information in context, lets the LLM "forget" stuff as it gets bumped out of the context window, and lets me approach this iteratively.
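A minimal sketch of that staged approach, in Python. Everything here is an assumption for illustration: `generate` stands in for whatever call sends a prompt to the local model and returns its text, and the prompt wording is made up. The point is only that each stage receives the previous stage's output instead of the raw documentation.

```python
from typing import Callable, Dict


def run_stages(docs: str, generate: Callable[[str], str]) -> Dict[str, str]:
    """Run three prompts in sequence, carrying forward only each stage's output.

    `generate` is a hypothetical stand-in for a local LLM call.
    """
    # Stage 1: the only stage that sees the full documentation.
    principles = generate(
        "Summarize the core principles of this documentation:\n" + docs
    )
    # Stage 2: works from the summary, so the raw docs can drop out of context.
    core = generate(
        "Using these principles, write the core functions:\n" + principles
    )
    # Stage 3: wraps the interface produced in stage 2.
    cli = generate("Write a CLI around this interface:\n" + core)
    return {"principles": principles, "core": core, "cli": cli}


def fake_generate(prompt: str) -> str:
    """Placeholder model: echoes the first line of the prompt it was given."""
    return "[output for: " + prompt.splitlines()[0] + "]"


if __name__ == "__main__":
    for stage, text in run_stages("(documentation text)", fake_generate).items():
        print(stage, "->", text)
```

Swapping `fake_generate` for a real model call is the only change needed to try it; whether the output compiles is a separate question.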

Not exactly the vibe-coding way - but for that you'd use some cloud provider, which I do not use, nor have a subscription to. Just me and my 4090 baby. :D

Dr. Hax 9mo ago

I'd be somewhat surprised if an LLM produced code that would compile, let alone call the right functions. All the functionality is there in signet-base and it's all called from the GUI client.

To be honest, this shouldn't be a particularly difficult task for an LLM. I wasn't trying to pick something difficult to try to stump it. I genuinely want these things to be useful for real-world things, despite all evidence to the contrary.

It reminds me of attack tools written to solve Capture The Flag hacking competitions, where they actually do work reasonably well on tiny, toy programs found in CTFs, and then utterly fall apart on analysing any real world code.

Dr. Hax 9mo ago

If it wouldn't be $20-30 in shipping, I'd offer to just send you a device to see what your local LLMs can do. 🙂

Dr. Hax 9mo ago

I still have a basic task that no AI system seems able to handle: adding a CLI program to do what the GUI program already does.

Show the world that I'm wrong and that today's AI can do this straightforward coding.

nostr:nevent1qqswn2903v0xlht4lrz75twr54pmdv5hly7jz3fcs8zt68p3n35wwxqpzemhxue69uhhyetvv9ujumt0wd68ytnsw43z7q3q6v82nr4xt62nlydtj0mtxr49r6enc5r0sl2f7cq2zwdw7q92j5gsxpqqqqqqznmaufm