It's possible, but I think LLM capability has plateaued. You have to extend the brain by adding memory and other functions. The rest of what's missing isn't LLMs but regular code.
What does that look like in practice? MCP servers and context files, or something else?
Retrieval-augmented generation helps a lot, and it still has room for improvement imo. If we start seeing more specialized LLMs, they could also cross-reference each other using RAG. I don't think that exists yet, though.
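To make the RAG part concrete, here's a minimal sketch of the retrieve-then-prompt step. The bag-of-words "embedding" is a toy stand-in for a real embedding model, and the doc strings are made up; only the shape of the pipeline (rank docs by similarity, prepend the top hits to the prompt) is the point.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": word counts stand in for a real embedding model.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "The emulator exposes a screenshot API.",
    "Cron syntax uses five fields.",
    "Retrieval pulls relevant docs into the prompt.",
]
context = retrieve("how does retrieval into the prompt work", docs, k=1)
# The retrieved passages get prepended to whatever the LLM is asked.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: ..."
```

The cross-referencing idea would just be two of these loops pointed at each other's document stores.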
AI script on a cron job. It would manage a mind-map that fits entirely in its context window. It would use this to produce tasks that would be fed to the system prompt of other AI calls. It would need specialized tools to verify results (eg, a tool to navigate an Android emulator and screenshot it). Upon getting results, it would update its mind-map and continue repeating.
To scale this (if the mind-map exceeds a context window) you would first show the AI a shallow version of the map, and it would call a tool to expand branches of it before it generates any tasks.
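The shallow-view-plus-expand idea could be sketched like this; the function names and the example map are illustrative, not from any existing system.

```python
def shallow_view(node, depth: int = 1):
    # Truncate the tree at `depth`, replacing deeper levels
    # with a marker showing how many children were collapsed.
    if not isinstance(node, dict):
        return node
    if depth == 0:
        return {"...collapsed...": len(node)}
    return {k: shallow_view(v, depth - 1) for k, v in node.items()}

def expand_branch(tree: dict, path: list[str]):
    # The tool the AI calls to see one branch of the map in full.
    node = tree
    for key in path:
        node = node[key]
    return node

mind_map = {"app": {"ui": {"login": "broken"}, "backend": {"db": "ok"}}}
overview = shallow_view(mind_map)               # only the top level
detail = expand_branch(mind_map, ["app", "ui"])  # one branch, in full
```

The AI sees `overview` first, then calls `expand_branch` on whichever paths look relevant before generating tasks.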
Makes sense. Is this something you came up with, or is it an established idea?
I'm still skeptical, seeing how these LLMs often "forget" things that are well within their context window.