I wonder, have you run any performance benchmarks across the various relay implementations? At the very least, how many ops/s can a relay process under read-only, write-only, and mixed workloads? And at how many ops/s can we finally say "it is scalable"?
I think that could at least be a starting point for the discussion.