Even if it takes a screenshot, wouldn't it only be getting the text ripped out of the image?

My understanding is that the inputs are always reduced to a string of tokens. But some feedback would be better than nothing.

Discussion

You can feed the Cursor agent images.