That's a good point. Wouldn't want to remove the Bible just because it's old, for instance. Popular stuff should remain available.
Discussion
it's for a caching strategy, so the document will still be available; if it's fetched again it moves back up the list and is retained
if people use it to read and search the Bible, it will never fall below the high water mark: its access counter will keep climbing, making it unlikely to be evicted even through a lull in usage
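The retention scheme being described could be sketched roughly like this (a minimal sketch with hypothetical names, not any particular relay's implementation): keep an access counter per event, and when the store grows past a high water mark, evict the least-accessed events down to a low water mark, so heavily read events survive.

```python
import heapq

class CacheGC:
    """Sketch of an access-counter cache with high/low water marks.
    All names are hypothetical; sizes are event counts for simplicity."""

    def __init__(self, high_water, low_water):
        self.high_water = high_water   # start collecting past this size
        self.low_water = low_water     # collect down to this size
        self.events = {}               # event id -> payload
        self.hits = {}                 # event id -> access counter

    def access(self, event_id, payload=None):
        # storing and reading both count as an access
        if payload is not None:
            self.events[event_id] = payload
        self.hits[event_id] = self.hits.get(event_id, 0) + 1
        self._maybe_collect()
        return self.events.get(event_id)

    def _maybe_collect(self):
        if len(self.events) <= self.high_water:
            return
        # evict the least-accessed events until we are at the low water mark
        n_evict = len(self.events) - self.low_water
        victims = heapq.nsmallest(n_evict, self.hits, key=self.hits.get)
        for event_id in victims:
            del self.events[event_id]
            del self.hits[event_id]
```

With this shape, an event read often enough accumulates a counter high enough that a temporary lull doesn't push it into the eviction set.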
The complicated part is that each verse will be a separate event. Since our events are nested, readers may access part of an index but not every individual section.
if your relay has a garbage collector on it, that's because there is a bigger store elsewhere that it fetches events from; even if an event was purged from the relay's cache, as soon as it's requested again it is fetched and stored while being sent back to the user
you aren't going to use the GC without this being the case, because the purpose is to distribute the load of data access: the archive is concentrated in a small cluster and users read from the caches
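The read-through path described above (a bigger archive behind the caching relay) might look something like this minimal sketch; `archive_fetch` and the dict-as-cache are stand-ins for illustration, not a real relay API.

```python
def read_through(cache, archive_fetch, event_id):
    """On a cache miss, fetch the event from the archive cluster and
    re-store it in the cache while returning it to the user.
    (Hypothetical helper; cache is any mapping, archive_fetch any
    callable returning the event payload or None.)"""
    event = cache.get(event_id)
    if event is None:
        event = archive_fetch(event_id)   # hits the bigger store elsewhere
        if event is not None:
            cache[event_id] = event       # re-cached after a GC purge
    return event
```

So a purged event costs one extra round trip to the archive the next time someone asks for it, and is then warm in the cache again.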
to be clear, this is precisely for the purpose of optimizing resource utilization among a number of participating relay operators
maybe if your data were just an endless stream of Twitter-style notes you might use a GC standalone and avoid having to monitor your relay's disk utilization, something that would probably save Will, for example, a lot of time constantly nuking the database