Maybe symbolic links for some of the indexing? We can also hash the IDs into a hashtable to minimize the size of the index.
Discussion
Maybe folders of links: by tags, by...
then folders with files grouped by timestamp or something.
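Rough sketch of the "folders of links" idea in Python, in case it helps the discussion. Everything here is an assumption for illustration: the `by-tag` / `by-date` directory names, the `link_event` helper, and one stored file per event. It only shows that the same event file can be reached through several symlink views without copying the data.

```python
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def link_event(root: Path, event_file: Path, tags: List[str], created_at: int) -> List[Path]:
    """Create by-tag and by-date symlinks that point at one stored event file."""
    day = datetime.fromtimestamp(created_at, tz=timezone.utc).strftime("%Y-%m-%d")
    # Hypothetical layout: <root>/by-tag/<tag>/ and <root>/by-date/<YYYY-MM-DD>/
    views = [root / "by-tag" / tag for tag in tags] + [root / "by-date" / day]
    links = []
    for view in views:
        view.mkdir(parents=True, exist_ok=True)
        link = view / event_file.name
        if not link.is_symlink():
            # Relative target, so the whole tree can be moved or synced as one unit.
            link.symlink_to(os.path.relpath(event_file, start=view))
        links.append(link)
    return links
```

Each link still costs a directory entry (and usually an inode), so this keeps lookups simple but does not help with the file-count problem by itself.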
vector db for search...
can be an interesting research/project
ngl, going full file-per-event feels like a shotgun blast to the inode table 😅
but yeah, sparse index pointing to offset ranges / symbolic links plus a nice compact idBloom tree would keep the map tiny while keeping the data split. chunky 4k event blobs per dir with date-partitioned symlinks = fast lookup, smaller rewinds, and rubbing fsync all over the place.
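To make the "sparse index pointing to offset ranges" part concrete, here is a minimal Python sketch. It is not the Bloom tree and not anyone's actual format. Assumptions: event IDs are fixed 32-byte values, records inside a blob are sorted by ID, and the record header layout, `write_blob`, `lookup`, and `stride` are all made up for the example. Only every `stride`-th ID is kept in the index, so the map stays small and a lookup scans one offset range instead of the whole blob.

```python
import bisect
import struct

HEADER = struct.Struct(">32sI")  # hypothetical record header: 32-byte id, 4-byte payload length


def write_blob(out, events, stride=4):
    """Write events sorted by id; return sparse (keys, offsets) sampled every `stride` records."""
    keys, offsets = [], []
    for n, event_id in enumerate(sorted(events)):
        if n % stride == 0:
            keys.append(event_id)
            offsets.append(out.tell())
        payload = events[event_id]
        out.write(HEADER.pack(event_id, len(payload)) + payload)
    offsets.append(out.tell())  # sentinel: end of the last range
    return keys, offsets


def lookup(blob, keys, offsets, event_id):
    """Pick the one offset range that could hold event_id, then scan only that range."""
    slot = bisect.bisect_right(keys, event_id) - 1
    if slot < 0:
        return None  # smaller than every indexed id, so it cannot be in this blob
    blob.seek(offsets[slot])
    while blob.tell() < offsets[slot + 1]:
        found_id, size = HEADER.unpack(blob.read(HEADER.size))
        payload = blob.read(size)
        if found_id == event_id:
            return payload
    return None
```

With `stride=4` the index holds a quarter of the IDs; a larger stride shrinks the map further at the cost of longer scans per lookup.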
might spin it on weekends with s3fs-fuse for warm / cold storage juggling. dm me if you want diff tracking, Vector (Privacy by Principle) can nudge giftwrapped test logs your way.
Nice! I am also wondering about write amplification when using single files. LMDB is the king of SSD damage. It would be cool to have something that plays nice with the SSD/eMMC in phones out there.
100%, LMDB loves its 4k random overwrites, phone flash cries.
split events into append-only seq-files (snowstorm style) per day/hour + fsync-once = nearly zero WA. bloom index sits in RAM; updates only when we roll over files, easy on the eMMC wear budget.
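A small Python sketch of that last idea, under stated assumptions: hourly segment files named by UTC hour, one event per line, a single flush + fsync when a segment is closed, and a tiny in-RAM Bloom filter built per closed segment. The `Bloom` and `SegmentLog` classes and their sizes are invented for illustration. Note the trade-off the sketch makes explicit: with one fsync per segment, anything written since the last rollover can be lost on a crash.

```python
import hashlib
import os
import time
from pathlib import Path
from typing import Optional


class Bloom:
    """Small Bloom filter; bit positions are sliced from one SHA-256 digest."""

    def __init__(self, bits: int = 8192, hashes: int = 4):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits // 8)

    def _positions(self, key: bytes):
        digest = hashlib.sha256(key).digest()
        for i in range(self.hashes):
            yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.bits

    def add(self, key: bytes):
        for p in self._positions(key):
            self.array[p // 8] |= 1 << (p % 8)

    def __contains__(self, key: bytes):
        return all(self.array[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class SegmentLog:
    """Append-only log split into one file per UTC hour (hypothetical layout)."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)
        self.blooms = {}      # closed segment name -> Bloom filter, kept in RAM
        self.name = None      # active segment file name
        self.file = None
        self.pending = set()  # ids in the active segment, not yet in any filter

    def append(self, event_id: bytes, line: bytes, now: Optional[float] = None):
        name = time.strftime("%Y-%m-%d-%H", time.gmtime(now)) + ".log"
        if name != self.name:
            self.roll()
            self.name = name
            self.file = open(self.root / name, "ab")
        self.file.write(line + b"\n")  # sequential append only, never an overwrite
        self.pending.add(event_id)

    def roll(self):
        """Close the active segment: one flush + fsync, then build its Bloom filter."""
        if self.file is None:
            return
        self.file.flush()
        os.fsync(self.file.fileno())
        self.file.close()
        bloom = Bloom()
        for event_id in self.pending:
            bloom.add(event_id)
        self.blooms[self.name] = bloom
        self.file, self.name, self.pending = None, None, set()

    def candidates(self, event_id: bytes):
        """Segments that may contain event_id (Bloom hits can be false positives)."""
        hits = [name for name, bloom in self.blooms.items() if event_id in bloom]
        if event_id in self.pending:
            hits.append(self.name)
        return hits
```

The filters are only touched at rollover, so steady-state writes are pure appends; a reader would still need to scan each candidate segment to confirm a hit.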