My focus has been on building tools that are user-friendly to both humans and agents.

Here's an example of an agent performing a task for me: scheduling book reviews, because I have a lot of content and don't want to flood the feeds. We can watch it use the tool and then review its work.

This example is particularly instructive because the agent gets confused and resorts to searching the source code to learn how my tools work. That indicates a failure in the context engineering, or in the --help output: the tool should give the agent enough information that it's confident in its work.
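As a minimal sketch of what "enough information in --help" might look like, here is a hypothetical scheduling CLI built with Python's argparse. The tool name, flags, and examples are all illustrative, not the actual tool; the point is that the help text alone describes the workflow well enough that an agent never needs to read the source.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical sketch: every flag is documented, and the epilog carries
    # worked examples, so `--help` alone is sufficient context for an agent.
    parser = argparse.ArgumentParser(
        prog="schedule-review",
        description="Queue a book review for publication without flooding feeds.",
        epilog=(
            "examples:\n"
            "  schedule-review --title 'Dune' --date 2025-03-01\n"
            "  schedule-review --list   # show the current queue\n"
        ),
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument("--title", help="title of the book being reviewed")
    parser.add_argument(
        "--date", help="publication date (YYYY-MM-DD); defaults to the next free slot"
    )
    parser.add_argument(
        "--list", action="store_true", help="print the scheduled queue and exit"
    )
    return parser

if __name__ == "__main__":
    build_parser().print_help()
```

Running the tool with `--help` (or with no arguments, in this sketch) prints the full contract up front, which is exactly the feedback an agent needs before acting.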

#Devstr



On the other hand, perhaps it didn't get confused, and the tool actually provided enough information for the agent to detect a bug. That would validate the context-engineering design: we want a feedback loop where the agent can act, reason, and react.
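The act/reason/react loop above can be sketched in a few lines. The `act` and `reason` callables here are stand-ins for "run the tool" and "the agent interprets the output", not a real agent API:

```python
def feedback_loop(act, reason, max_steps=5):
    """Act -> reason -> react until the agent reports it is confident.

    act(plan)    -> observation (e.g. tool output); plan is None on the first step.
    reason(obs)  -> (plan, confident): the agent's next plan and whether the
                    tool output gave it enough confidence to stop.
    """
    observation = act(None)            # first action, no prior plan
    for _ in range(max_steps):
        plan, confident = reason(observation)
        if confident:
            return plan                # tool output was informative enough
        observation = act(plan)        # react: try again with the revised plan
    return None                        # loop exhausted: a context-engineering failure
```

If the loop keeps exhausting its step budget, that's a signal the tool's output isn't carrying enough information, which is the failure mode described above.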

#AI

Test-driven development is incredibly important, but it's only one layer of defense. When you design a tool to be directly usable by an agent, you've essentially added another layer of quality assurance.

By executing a workflow for me, the agent performs a second job: refining the codebase (under my direction).

#QA