a flurry of discussions and things relating to my work lately has led me to something
there are two existing query types in nostr, the REQ and the COUNT
REQ has no concrete way of signalling how many events match a request beyond whatever limit it has been hard coded to cap results at
COUNT gives you back a number but no other metadata, so there is no easy way to learn more without making multiple COUNT queries
"since" and "until" fields in filters can be used to create a boundary that limits the number of results from a REQ, but it is inadequate
i could swear i already made this suggestion before
but i'm gonna make it again, if i did, or for the first time if not
there should be a query that just spits back a list of all the event IDs that match a query, and if you set no limit, it just returns the whole set
if you consider that some follow events take as much as 512kb of data, and this is often a common size limit for individual events too, then that much space is good for somewhere around 7,800 individual event IDs in a result (64 hex characters plus quotes and a comma is 67 bytes per ID), it could be as simple as an array, so the only overhead is the brackets, quotes and commas
perhaps this is not sufficient though, maybe you want to include the timestamp next to each event ID... or maybe you could make it so you give the full timestamp on the first event ID, and after that each entry is the offset in seconds from the previous one, this would mean that the list would be something like
[[12345678,"<event id>"],[5,"<event id>"],[17,"<event id>"], ...]
i'm inclined to say fuck the datestamps, i'm just gonna make a new REQ variant that returns the IDs as an array instead of the full results, and to keep with the style, it will just be
["REQID","subscriptionID","
the relay can already specify a size limit in its nip-11 relay information, so it can just stop just before that limit, and the user can query for the last event in the list and use its timestamp as "since" to get the rest
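for reference, nip-11 already carries this in the "limitation" object, e.g. (real nip-11 field names, made-up values):

{"limitation":{"max_message_length":524288,"max_limit":500}}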
nostr:npub1ntlexmuxr9q3cp5ju9xh4t6fu3w0myyw32w84lfuw2nyhgxu407qf0m38t what do you think about this idea?
if the query has sufficiently reasonable bounds, like, it is very unlikely you want more than 7,800 events over a period of, let's say, the last day, of a specific kind, and certainly not if you limit it to some set of npubs
but you would still know where the results end, and so long as you stick to the invariant of "this is what i have on hand right now", the question of propagating queries is capped by the answer of "what i have"; whether you have a second layer, and whether you then cache the results of that query so next time you can send a more complete list, is implementation internal
and i haven't even factored in this option:
what about if instead of returning the IDs encoded in hex (50% efficiency versus the binary hash size) you send them as base64 encoded versions of the event IDs, that gives you 75%, or in other words expands the hypothetical max results of just IDs from around 7,800 to around 11,000
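checking the arithmetic: unpadded base64 of 32 bytes is 43 characters (44 with the "=" pad), against 64 for hex, so with quotes and a comma each array element costs 67 bytes hex versus 47 base64:

524,288 / 67 ≈ 7,800 IDs per 512kb message, hex
524,288 / 47 ≈ 11,000 IDs per 512kb message, base64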
ws maximum message size is a problem so you should use multiple messages
it can be pretty low depending on the client
yeah but that is configurable, you can set a standard
except gay clients, and they can have a gay standard
this is relevant to what you have been saying about protocol headers, but i think it's stomping on nip-11, and if clients aren't querying for that then they can go to hell, actually, literally be set on fire and burn alive
oh yeah... easy to solve that
this new query type has a "max bytes" field. done
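so the request grows one field, something like this ("maxbytes" is a name i'm making up on the spot, nothing specifies it):

["REQID","subscriptionID",{"kinds":[1],"since":1716800000,"maxbytes":262144}]

the relay fills the array until the next ID would bust that number, then stops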
if they want moar, then tough shit because the relay sed, what it sed, nip-11 bitch
also you could easily perform a DoS attack on certain events by creating a lot of events with a higher event ID (or lower depending on the relay impl) and the same created at as the target event
you reach your limit before the target event gets returned and there is no way to find it except by id
this will also become a problem once nostr has a high event volume per second
the only way to fix this is to support proper pagination
higher event ID? by timestamp? based on a filter?
that's a pretty narrow attack surface
clients can mitigate it by being more specific, things like authors are a huge limiter
also i don't think you really fully understand how you implement pagination
you have to generate a list of matching event IDs to do it, then the relay has to store that in a space limited query cache that expires after some time or volume
the idea to make a new query type that just returns these came from me thinking about that fact
it's the first step towards it, but it probably eliminates the problem altogether, seriously, 7,800-11,000 event ID matches for a query without busting the current typical max response sizes?
DoS attack mitigation would not be relevant there, that's a network layer issue, not an application layer issue, insofar as this is a per-query limit and not about the total volume from a client, which is not the same thing
pagination can be represented statelessly with an event ID and its timestamp
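i.e. you re-issue the same filter with "until" set to the created_at of the last event from the previous page, something like:

["REQ","subscriptionID",{"kinds":[1],"until":1716799999,"limit":50}]

and drop any IDs you already saw at that exact timestamp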
no, because what happens when a new event that matches the filter gets stored?
what happens if i ask for the most recent 50 events matching a filter
then ask for page one of a 10-page split of that result after 5 more new events come in?
nope, definitely stateful
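to spell it out (using a hypothetical "page" parameter that nothing actually defines):

t=0   most recent 50 matching events in 10 pages of 5: page 1 = events 50..46
      5 new matching events get stored
t=60  ask for page 1 again: now it's events 55..51, and what used to be page 1 is page 2

page numbers only stay meaningful if the relay froze the result list at t=0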
i don't know what you are saying
so i send a query again for "events since X" and it's been a minute and 5 more have been stored that match
how is that not stateful
the query is pinned to a time, yes; the results are not
Can't you just paginate the results or something?
that's why i've been saying we need at least a "results as a list of IDs" query type
pagination is a query engine thing
it gets a query, plus some spec of page X and per-page Y, and that big list of IDs has to be cached so the rest of the same query result list can be consumed in later requests
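i.e. something like this hypothetical shape (the "PAGE" verb and page/per-page fields are invented here):

["PAGE","subscriptionID",{"kinds":[1]},{"page":3,"per_page":100}]

and serving page 3 correctly means the full ID list from the original query is still sitting in that cache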
i say, first of all, relays are typically limiting responses to around 500kb anyway
that's enough to spew around 7,800 event IDs
so why not have a simple query that just gives you that whole list, and forget pagination, you do that yourself, bitch!
honestly, the "count" query is a buncha bullshit, a completely stupid idea
it should always have been "GETIDS" or something like that, or "QUERY" even, fuck it, what even is "REQ"