Whenever I vibecode UIs it feels like the agent is flying blind because it can't see what's being rendered.

Have any of you tried integrating a browser-automation MCP like this one? Seems like this could really help the agent QA its work.

https://github.com/modelcontextprotocol/servers/tree/main/src/puppeteer
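For anyone who wants to try it: registering the server in a client config looks roughly like this (this is the npx invocation from that repo's README; adjust the package name if it's moved):

```json
{
  "mcpServers": {
    "puppeteer": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-puppeteer"]
    }
  }
}
```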


Discussion

Hi justin! Nice to meet you 🤝😉 Thank you for the work you do 🤟🏴‍☠️

If you got an agent to actually submit PRs, you could have it use something like this to take a recording of the feature.

There are already some MCP servers that should do the job: take a screenshot of a browser page and return it.

Yeah, the one I linked does that. I just wonder whether it actually helps or not?

Even if it takes a screenshot, wouldn't the model only be getting the text ripped out of the image?

My understanding is that the input's always reduced to a string of tokens. But some feedback would be better than nothing.
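To the point above: screenshot tools typically hand back the raw pixels base64-encoded rather than OCR'd text, so a multimodal model sees the actual image. A rough stdlib-only sketch of the wrapper shape (the exact field names vary by client/API, so treat this dict layout as an assumption):

```python
import base64


def image_to_content_block(png_bytes: bytes) -> dict:
    """Wrap raw PNG bytes as a base64 image content block,
    roughly the shape multimodal chat APIs and MCP tool
    results expect. The pixels survive the round trip; no
    text extraction happens here."""
    return {
        "type": "image",
        "data": base64.b64encode(png_bytes).decode("ascii"),
        "mimeType": "image/png",
    }
```

So "feeding the agent a screenshot" is just sending these bytes along with the prompt; whether the model makes good use of them is the open question.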

You can feed cursor agent images

This android one is insane https://github.com/hyperb1iss/droidmind

I've done this, but I don't remember if I did it with goose or manual screenshots and the Claude chat interface. It worked pretty well

What's your favorite agent tool? I've tried Cursor and Claude Code but not goose yet.

I'm big on goose. It's a bit spicy, but so am I

Have you checked out Browser Tools MCP?

https://browsertools.agentdesk.ai/installation

No I haven't, looks sweet, thanks for sharing!

I also highly recommend this developer’s prompts:

https://www.agentdesk.ai/prompts

Let me know what you all think.