I would rather have a relay that protects me from clients than trust the clients to be sanely written. I think relays have a long way to go.
Discussion
you cannot trust the clients in a decentralized network. that is why the relay must use common techniques to prevent abuse.
BUT the client must also implement sane data access and connection patterns for the client to work at all.
in this particular case I will guarantee you the vibe-coded client is doing terrible things with websocket connections and queries. fixing that is step one to making the client work.
Another issue with using websockets instead of plain HTTP in this case is that it defers the abuse protections completely to the application servers (or custom proxies), meanwhile hogging whole TCP streams in the load balancers. With existing HTTP infra, load balancing and resource-abuse protection can happen in multiple tiers, sparing the application software (and, more specifically, relay developers).
That said I'd argue almost all web abuse protections are handled way before the client request even makes it to the application server. And it's done very quickly.
totally agree. websockets are annoying. but tcp connections are cheap and as you said you can quickly handle bad actors once the channel is open.
to solve this specific problem i would do something like
- one relay for read, heavy caching to reflect data access patterns in vine client
- one relay for search
- fanout to the proper relay via a single api gateway tier
- scale every layer independently
- write a sane client
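A minimal sketch of that gateway fanout tier, assuming the simplest possible routing rule: subscription filters with a full-text term go to the search relay pool, everything else to the cached read pool. The pool URLs and the filter shape here are illustrative assumptions, not anything from the thread or the nostr spec.

```python
# Hypothetical gateway fanout: route each subscription filter to a
# purpose-built relay pool. Pool contents and filter keys are made up
# for illustration.
READ_POOL = ["wss://read1.example", "wss://read2.example"]
SEARCH_POOL = ["wss://search1.example"]

def route(filter_: dict) -> list:
    """Pick the relay pool that should serve this subscription filter."""
    if "search" in filter_:      # full-text queries go to the search tier
        return SEARCH_POOL
    return READ_POOL             # everything else hits the heavily cached read tier
```

Each pool can then be scaled (and cached) independently, which is the point of splitting them in the first place.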
> tcp connections are cheap
I would disagree, but we could be talking about different scales. Each connection costs at least two extra OS handlers, at least 16k (usually 64k) of kernel-space buffers (x2 for rx and tx), and often userspace (usually shared) buffers, per connection, per LB. L7 LBs generally multiplex TCP connections and dramatically cut down on memory consumption, even if that can be a bit aggressive. For every websocket opened in my http server, tuned to 16k, that's about 256k after the upgrade is established (because websockets are usually fully buffered), and the http buffers are only freed in userspace. For 5000 connections that's over 1.2G of committed system memory, minimum, just hanging about. I would expect software like nginx to be more optimized in comparison, but it still hogs up LBs, and I've heard other sysadmins share their stories of websocket overhead.
L7 LBs multiplex these HTTP/1.1 connections, generally cutting down to double-digit upstream connections to service 5-6 digits of ingress traffic.
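As a quick sanity check of the numbers above: ~256 KiB of committed buffers per established websocket (the figure cited in the comment, not a universal default) across 5000 connections does land just past 1.2 GiB.

```python
# Back-of-envelope check of the per-websocket memory estimate above.
# 256 KiB per connection is the comment's own figure, not a universal default.
KIB = 1024
per_conn_bytes = 256 * KIB            # committed buffers after the upgrade
total_bytes = 5000 * per_conn_bytes   # 5000 idle websockets
gib = total_bytes / (1024 ** 3)
print(f"{gib:.2f} GiB")               # prints 1.22 GiB
```

So "over 1.2G just hanging about" checks out under those assumptions.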
I agree with the basic architecture, that is, a highly functional cache with a "popularity decay" model. I can't remember the scientific name.
os handles*
popularity decay => least recently used (LRU)
Yes, but that's not a great model. LRU breaks down in its pure form here: a single user pulling an old file forces fresher content out. TTLs have to be attached and respected by each tier. There is an official name for this model, I just don't remember it. As files age their TTL decreases, which avoids polluting caches.
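A minimal sketch of the LRU-plus-TTL idea described above: eviction honours recency, but every entry also carries an expiry deadline, so one user pulling an old file can't keep it resident. This is my own illustration of the combination, not the named algorithm the comment is reaching for.

```python
import time
from collections import OrderedDict

class TTLLRUCache:
    """LRU cache where each entry also carries an absolute expiry time,
    so a single request for a stale item cannot pin it in the cache."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()                # key -> (value, expires_at)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        item = self.data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if now >= expires_at:                    # expired: drop instead of refreshing
            del self.data[key]
            return None
        self.data.move_to_end(key)               # LRU touch on hit
        return value

    def put(self, key, value, ttl, now=None):
        now = time.monotonic() if now is None else now
        self.data[key] = (value, now + ttl)
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)        # evict least recently used
```

The `now` parameter is just there to make the expiry behaviour easy to exercise deterministically; a real tier would use the clock directly.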
frecency-based models are what you are looking for
Yeah, but there is still a more specific model I'm trying to remember the acronym for. It literally describes exactly how to label content with TTLs based on the access frequency of the exact content being hosted. Something one of the big tech companies published. Basically, given that the site is hosting images to be shared on social media, it's known that the images will see their highest frequency immediately after being shared, then decay at a known rate. It can even get as accurate as saying: given an image of a puppy, you can apply TTLs that model how the image of the puppy should be cached.
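One way to sketch that "TTL from expected popularity decay" idea, assuming exponential decay with a per-content-class half-life. The classes, half-lives, and base TTL below are invented for illustration; they are not from whatever published model the comment is trying to recall.

```python
# Illustrative half-lives (hours) per content class; these numbers are
# made up for the sketch, not taken from any published model.
HALF_LIFE_HOURS = {"looping_video": 12.0, "profile_image": 72.0}

def cache_ttl(age_hours, content_class,
              base_ttl_hours=24.0, floor_hours=0.5):
    """TTL shrinks as content ages past its expected popularity peak:
    ttl = base * 2^(-age / half_life), clamped to a floor so old items
    are still cacheable for brief bursts."""
    half_life = HALF_LIFE_HOURS[content_class]
    return max(floor_hours, base_ttl_hours * 2 ** (-age_hours / half_life))
```

A freshly shared looping video gets the full base TTL; a week-old one has decayed all the way down to the floor, so it can't crowd fresh content out of the cache for long.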
substitute "puppy" here for some known constant, like: my users usually upload X content. In the case of devine, users will only be publishing short looping videos designed for user attention, which might frequently contain images of cute animals...
^ And relays only scale well currently because clients assume poor consistency, and load balancing and reconciliation happen at the client. And more specifically because nostr users can't really know any better.