I was thinking more along the lines of the maximum number of open connections/simultaneous clients spawned by your wait/listen thread: either you would have a maximum number of threads, or a maximum number of tracked connections if they are managed by the main port/socket listening thread.
I don't read Rust and haven't worked with websockets in C or C++ (only in Python), and I haven't investigated high-volume websocket service architecture. My main languages are C and C++.
In any case, your main program would have a way of knowing how many sockets it's serving, I would imagine. Of course, if it rejects connections from unauthenticated clients, this is a moot point. But if it is designed to serve up public outboxes from authenticated users, then theoretically ten posters, each with 1,000 active followers, could balloon to 10k open sockets. Multiply that by 10 or 100, and connection-handling architecture can become a problem if there are 1M threads.
I'm doing a lot of counting angels on the head of a pin here. Is a separate thread allocated to each websocket? Does your pre-banning scheme avoid DDoS, or intentional websocket/process overloads?
In my use case, it's highly unlikely to be a real concern.
Yes, I count how many sessions are open, and I could stop accepting requests when that number gets too high. In my deployment scenario it only gets to 70 tops, but there are only 3 of us on there.
I'm using asynchronous rust. It doesn't use one-thread-per-socket. Instead, there is one-thread-per-core, and whenever a thread hits a wait point (waiting for socket I/O or a timer or anything else) the task gets shelved, and as soon as an event happens related to any of the shelved tasks, an idle thread goes and handles it up to the next await point. All cores can be fully busy if I/O becomes ready fast enough, but most of the time all cores are just waiting for I/O.
TCP/IP ports are 16-bit, so the maximum port number is 65535. Ports below 1024 are reserved for system services and aren't randomly allocated. Each websocket connection uses one of the remaining 64511 ports. You can't have more network connections than that, AFAIK, so we can't serve 1 million clients simultaneously. I might be wildly wrong about this.
I'm not sure if that's how websockets work, by assigning a TCP port to each connection, but then again it's been over 30 years since I read my favorite 800-page volume on TCP/IP from the library.
I think I'm wrong. I've got TCP/IP Illustrated Vol. 1 and 2 right here on the bookshelf. Shall I learn?
Here is what GPT-4 says:
"In TCP/IP networking, a connection is uniquely identified by a tuple consisting of the source IP address, source port, destination IP address, and destination port. While the port number is a 16-bit number, limiting the number of ports to 65,536 per IP address, the distinction between connections is made based on the combination of these four elements, not just the port number."
It sounds like this is a POSIX-sockets-layer capability. On the other hand, each open socket requires an open file handle, so it can be constrained in limits.conf.
Yeah, I'm reading that elsewhere too. So a given computer can only connect to my chorus port 64k times over, but every other computer on the Internet can also connect to my chorus port 64k times over. [In actuality most systems have fewer than half that many port numbers available for this purpose, but it's still overkill.]
So all it would take is about 20 machines with separate IP addresses and a malicious websocket client opening 50k connections each to reach 1M open connections with keep-alive traffic and attempt a DDoS (even if it takes some time, given the initial connection rejections).
Fortunately, since each thread can handle multiple websockets, your implementation may be able to handle it. On the other hand, once all those connections are open, it would be trivial to perform a simultaneous high-demand filter-query request.
I'd like to try it.
"In theory there is no difference between theory and practice, but in practice there is."
Don't do it to my current machine. This was not permission to DDoS my domain! Hehe. I'd have to arrange it with the data centre first.
If you have only one thread per CPU, does that mean any requests from the sockets on that CPU are queued?
Sockets are not pinned to CPUs. Any thread (on any CPU) can pick up any task that is ready for further processing.
Are the requests processed asynchronously from the database? For example: the requests are processed by one queue, but the database lookup is dispatched by the request processor, and the result is returned after the database has returned data. Such that if you have two clients (A and B) making two separate requests, and A is processed before B (both in the same request-handling thread), but the lookup for B returns data to be sent back before A's does, then B would get its response before A?
Yes, it is all asynchronous. Within async Rust, everything is asynchronous; there is no tricky coding on my part. Every function that says it returns a type T actually returns a future that resolves to a type T, and the executor handles running that future to completion, even as the task sometimes returns Poll::Pending instead of Poll::Ready(data), in which case it gets shelved so the thread can keep itself busy with anything else that needs doing. This is all internal to Rust's async system and the executor I use, which is tokio (there is another one, async-std, but I don't think it ever got much uptake).
I take this back, see other response to the parent.
So Rust's async is more like green threads than like Microsoft's WDM deferred procedure calls?
BTW it doesn't just repeatedly poll and keep getting Poll::Pending. When it gets one of those, it doesn't poll again until the waker system wakes up the task when the I/O is actually ready.
Shit, I seem to have entirely failed to async in the chorus-lib. Huge oversight.
Eh, nah, it looks like `heed` (the LMDB library) isn't async, so I couldn't do it if I wanted to. Data access is memory-mapped, and any delays happen in the kernel (I think) and are kind of hidden from the application (the kernel handles the page fault; our code is put to sleep while that happens), so it's hard to async in userspace in this case, but it should be fast nonetheless. Potentially a thread sleeps over a page fault when the CPU could otherwise have been busy, but probably not something we can fix.
All network waits are definitely async futures.
The test could be run on an internal private network fairly easily. Each client could be on a separate VM; possibly all 20 clients could even be on one machine, done with the bridge interfaces of Docker containers.
I read Vol. 1. Didn't know about Vol. 2.