Latency of such pipeline will include 2x latency of relay for each job. Imagine a scenario of frequently small jobs (e.g. translation is needed many times during reading some text). I think job consumer and job worker should find each other once on nostr and send data directly one to another. Also it makes sense to pay once for a package of jobs, because payment also involves some latency and constant costs.
Discussion
Websockets are incredibly fast
A one to one model is already possible; This solves a different problem
Well, if the user, relay and the worker are in different geographical areas, latency will be noticeable. It will worsen user experience
Wonโt be noticeable. But service providers can geolocate where most profitable. AI latency is going to be orders of magnitude more significant than two extra hops that are measured in dozens of milliseconds at most
Again, websockets are brutally fast