Hmm, I guess it’s expensive, but I don’t see why agents in a Ralph loop under the supervision of a comprehensive test suite can’t deliver top-percentile results. You just iterate until you see the desired behavior. Where are the risks?