It’s a lot of hacking about and reading at the moment, as it’s quite an ambitious project for me.
It’s very reliant on the GPT-4 model at the moment, but I’m hoping to substitute in more capable LLMs asap.
One major aspect for @jack to consider is that GitHub’s near-monopoly on code repositories is the bedrock of GPT-4’s monopoly on AI coding ability.
To have an open-source LLM that can surpass GPT-4’s coding abilities, we really need a large open-source code repository that can be easily structured as training data.
Without that, OpenAI’s LLMs are going to be eclipsed on chat, yes, but will remain stubbornly superior for coding.
Of course, we now have a massive code-generation engine in GPT-4 itself, so it’s likely they sow the seeds of their own downfall; we just need to direct the output of that engine into a suitable open-source location with an inherent training-data schema.
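To make “inherent training-data schema” a bit more concrete, here’s a minimal sketch of what capturing each accepted generation could look like. Everything in it (the field names, the JSONL layout, the license tag) is my own assumption, not an established standard:

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def record_sample(prompt: str, code: str, language: str,
                  out_path: Path = Path("corpus/samples.jsonl")) -> None:
    """Append one prompt/completion pair as a training-ready JSONL record."""
    sample = {
        # Stable id so duplicates can be filtered before training.
        "id": hashlib.sha256((prompt + code).encode()).hexdigest()[:16],
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "language": language,
        "prompt": prompt,         # the instruction the model was given
        "completion": code,       # the code that was accepted/merged
        "license": "Apache-2.0",  # assumed permissive license for the corpus
    }
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(sample) + "\n")
```

One JSONL line per accepted snippet keeps the whole corpus trivially streamable into most fine-tuning pipelines.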
Everything else is open source.
I think it’s possible to have a USB stick with less than 80 KB on it that can iPXE-boot (ipxe.org) from the web into a RAM-only Linux instance (slax.org).
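What the tiny USB image would fetch is a chain script along these lines. A hedged sketch: iPXE’s `dhcp`/`kernel`/`initrd`/`boot` commands are real, but the mirror URL and the Slax boot argument (`toram` for copy-to-RAM) are assumptions I haven’t verified:

```python
from pathlib import Path

# Minimal iPXE chain script. The URL is a placeholder, and `toram`
# is my assumption for Slax's RAM-only boot mode.
IPXE_SCRIPT = """#!ipxe
dhcp
kernel http://example.org/slax/boot/vmlinuz toram
initrd http://example.org/slax/boot/initrfs.img
boot
"""

def write_boot_script(path: Path = Path("boot.ipxe")) -> None:
    """Write out the script the <80 KB iPXE USB stub would chain-load."""
    path.write_text(IPXE_SCRIPT)

if __name__ == "__main__":
    write_boot_script()
```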
From there it can download self-assembling code, then begin downloading and installing DB libraries and asking for API keys. It then maps its environment, completes the stand-up process, and asks for a mission.
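Once the instance is up, the stand-up stage could look something like this sketch. All of it is guesswork scaffolding: the choice of DB libraries, the key names, and the flow are assumptions, not a fixed design:

```python
import getpass
import os
import platform
import shutil
import subprocess
import sys

def install_db_libraries() -> None:
    """Install DB client libraries; the pip packages are assumed choices."""
    subprocess.run([sys.executable, "-m", "pip", "install",
                    "psycopg2-binary", "redis"], check=True)

def collect_api_keys() -> None:
    """Ask the operator for the keys the agent needs; names are assumptions."""
    for name in ("OPENAI_API_KEY", "GITHUB_TOKEN"):
        if not os.environ.get(name):
            os.environ[name] = getpass.getpass(f"{name}: ")

def map_environment() -> dict:
    """Record what the instance is running on before taking a mission."""
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "disk_free_gb": shutil.disk_usage("/").free // 2**30,
    }

if __name__ == "__main__":
    install_db_libraries()
    collect_api_keys()
    print("Environment:", map_environment())
    mission = input("Mission: ")  # stand-up complete; await instructions
    print(f"Accepted mission: {mission!r}")
```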
It’s ephemeral (RAM-only) for security reasons, because it has full permissions and can write whatever it sees fit.
It will break a prompt down into token-limit-sized workscopes, develop each scope, and break it down again until it is able to code the functionality.
Splitting the prompt into workscope chunks, splitting those further, and then tracking and reassembling the full system is the most difficult part of this. But if human orgs can do this, so can LLMs.
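To make that loop concrete, here’s a minimal sketch of recursive split-and-reassemble, assuming the scope tree itself is the tracking structure. The token heuristic, the budget, and the `llm_split`/`llm_code` helpers are all placeholders for real model calls:

```python
from dataclasses import dataclass, field

TOKEN_LIMIT = 4000  # assumed per-call budget

@dataclass
class Workscope:
    """One node in the decomposition tree, kept for later reassembly."""
    spec: str
    children: list["Workscope"] = field(default_factory=list)
    code: str = ""

def estimate_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic: ~4 characters per token

def llm_split(spec: str) -> list[str]:
    """Placeholder: ask the model to divide a spec into smaller sub-specs."""
    raise NotImplementedError

def llm_code(spec: str) -> str:
    """Placeholder: ask the model to implement a spec that fits the budget."""
    raise NotImplementedError

def develop(scope: Workscope) -> str:
    # Small enough to implement directly; otherwise split and recurse.
    if estimate_tokens(scope.spec) <= TOKEN_LIMIT:
        scope.code = llm_code(scope.spec)
    else:
        scope.children = [Workscope(s) for s in llm_split(scope.spec)]
        scope.code = "\n\n".join(develop(child) for child in scope.children)
    return scope.code  # reassembly: concatenate the children bottom-up
```

Keeping every sub-scope as a node in the tree means reassembly is just a bottom-up walk, which is the part human orgs do with tickets and project plans.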