i don't know what you mean by "load balanced" to be honest
a nostr relay is really just a database server in itself
do you want to replicate the database, do you want to demand-cache it, do you want to shard it? it matters a lot which strategy you have in mind and whether it actually fits the use case
sharding is probably gonna need to use the social graph, maybe combined with geofencing
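rough sketch of what i mean by social graph plus geofencing, in python. everything here is made up for illustration: `pick_shard`, the `follows` and `assigned` maps, and the shard names are all hypothetical, not anything a real relay implements. the idea is just: put a user on the shard where most of the people they follow already live, limited to the shards inside their region, and fall back to a hash when there's nothing to go on

```python
import hashlib
from collections import Counter
from typing import Dict, List, Set


def pick_shard(pubkey: str,
               follows: Dict[str, Set[str]],
               assigned: Dict[str, str],
               region_shards: List[str]) -> str:
    """Pick a shard for pubkey (hypothetical policy, for illustration only).

    follows:       pubkey -> set of pubkeys they follow (the social graph)
    assigned:      pubkey -> shard already chosen for that user
    region_shards: shards allowed for this user's region (the geofence)
    """
    # Count where the followed accounts live, ignoring shards outside the region.
    votes = Counter(
        assigned[f]
        for f in follows.get(pubkey, ())
        if assigned.get(f) in region_shards
    )
    if votes:
        # Most votes wins; ties break on shard name so the result is deterministic.
        return min(votes, key=lambda shard: (-votes[shard], shard))
    # No usable graph data: fall back to a stable hash of the pubkey.
    digest = hashlib.sha256(pubkey.encode()).digest()
    return region_shards[int.from_bytes(digest[:8], "big") % len(region_shards)]
```

whether that kind of placement is worth it depends on how clustered the follow graph really is, which i haven't measured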
caching is the simplest, dumbest idea, where the relays that people use don't actually store much data but they proactively fetch new data and expire old data quickly to keep space free
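to make the caching idea concrete, here's a minimal sketch in python. it's the on-demand variant (fetch on a miss, drop after a short ttl); proactive fetching would just mean calling `get` before anyone asks. `DemandCache`, `fetch_upstream` and the 300 second default are my assumptions, not part of any relay implementation

```python
import time
from typing import Callable, Dict, Optional, Tuple


class DemandCache:
    """Tiny demand cache: fetch on miss, expire entries after ttl seconds."""

    def __init__(self,
                 fetch_upstream: Callable[[str], Optional[dict]],
                 ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.fetch_upstream = fetch_upstream  # hypothetical: asks some other relay
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self.store: Dict[str, Tuple[float, dict]] = {}  # id -> (stored_at, event)

    def get(self, event_id: str) -> Optional[dict]:
        now = self.clock()
        hit = self.store.get(event_id)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # fresh copy, no upstream traffic
        # Missing or stale: drop any old copy and go upstream.
        self.store.pop(event_id, None)
        event = self.fetch_upstream(event_id)
        if event is not None:
            self.store[event_id] = (now, event)
        return event

    def expire(self) -> int:
        """Drop everything older than ttl; returns how many entries were removed."""
        now = self.clock()
        stale = [k for k, (stored_at, _) in self.store.items()
                 if now - stored_at >= self.ttl]
        for k in stale:
            del self.store[k]
        return len(stale)
```

the relay stays small because nothing lives longer than the ttl unless somebody asks for it again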
caching is how i envisioned it working... you could extend it with broadcast so that new data is shared around immediately, and then garbage collection rapidly purges it from the store once it stops getting traffic
how you design that optimization depends entirely on how the data is going to be used and propagated
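the broadcast plus garbage collection part could look something like this. again just a sketch under my own assumptions: `CacheRelay`, `idle_limit` and the single-hop broadcast (peers are assumed to be directly connected to each other) are hypothetical, and real relays talk over websockets rather than method calls

```python
import time
from typing import Callable, Dict, List, Optional


class CacheRelay:
    """Cache relay that broadcasts new events and purges ones nobody reads."""

    def __init__(self, name: str, idle_limit: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.name = name
        self.idle_limit = idle_limit  # seconds without traffic before purge
        self.clock = clock
        self.peers: List["CacheRelay"] = []
        self.events: Dict[str, dict] = {}
        self.last_hit: Dict[str, float] = {}  # event id -> last store/read time

    def _store(self, event: dict) -> bool:
        """Store an event; returns False if we already had it."""
        eid = event["id"]
        if eid in self.events:
            return False
        self.events[eid] = event
        self.last_hit[eid] = self.clock()
        return True

    def publish(self, event: dict) -> None:
        """A client published here: store it and push it to every peer right away."""
        if self._store(event):
            for peer in self.peers:
                peer.receive(event)

    def receive(self, event: dict) -> None:
        """A peer pushed an event: store it, but do not re-broadcast (single hop)."""
        self._store(event)

    def query(self, event_id: str) -> Optional[dict]:
        event = self.events.get(event_id)
        if event is not None:
            self.last_hit[event_id] = self.clock()  # traffic keeps it alive
        return event

    def collect_garbage(self) -> int:
        """Purge events idle for longer than idle_limit; returns the purge count."""
        now = self.clock()
        idle = [eid for eid, t in self.last_hit.items()
                if now - t > self.idle_limit]
        for eid in idle:
            del self.events[eid]
            del self.last_hit[eid]
        return len(idle)
```

so an event shows up everywhere immediately, and each relay independently forgets it once its own users stop asking for it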
replicated databases (Raft/Paxos/PBFT) inherently ARE exact replicas. you even used the word replica, but you didn't mean replica, because you just clearly said you didn't mean replica
a replica is like a bitcoin node with the same blockchain, that's a replica. all replicated database protocols make replicas, and an individual replica doesn't get to arbitrarily decide not to store some of the data
what you are talking about is a caching strategy, which means you want to do garbage collection and broadcast