A self-hosted solution that saves and indexes everything you've seen online, keeps a complete archive, and tracks every change in the content. This sounds like heaven and hell at the same time. But I guess with smart search, filtering, AI, and whatnot, this will be amazing. I wonder if there are projects trying to achieve this.

nostr:nevent1qqsgmlnzkj4mw2utl5juacwm58lwdra2pw4c089s8m5edxs2etwqh5qpzpmhxue69uhkummnw3ezumt0d5hsygxxcxgj4trvykld5afjk0zn73temt9h038sg72hf52xhwxl9jhzgspsgqqqqqqspu8xgf

Discussion

Hopefully there are :)

Yesterday or the day before yesterday, someone wrote about how KYC will kill Web2.0 and how everything interesting is, or will be, hidden or obfuscated. They said that Web5.0 will become a network of personal information silos. I can't imagine how that would work, but the observation is correct, and the concept is definitely interesting.

I am no longer with the company, but the project may see the light of day soon. No idea, but here it is:

https://github.com/operating-function/packrat

I've got a few ideas for doing this over nostr that I'd like to work on someday. currently occupied with #catallax though

Notes themselves are easy: an "a" tag and a "p" tag, and that's it. Media in those notes is more difficult, but querying a local blossom server when the original is no longer available would also work. For content not native to Nostr, I'm not sure what the benefit would be, or whether relays would be happy about storing it... Maybe metadata distribution? I don't know, I'm a noob and you've been working on it for years. Sorry.
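
A rough sketch of the "fall back to a local blossom server" idea, in Python. It assumes that blobs are addressed by their SHA-256 hash and that the hash appears in the original media URL. The server address `http://localhost:3000` and the function names are made up for illustration.

```python
import hashlib
import re
from urllib.parse import urlparse

# A SHA-256 digest written as 64 lowercase hex characters.
SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def blob_hash_from_url(url):
    """Return the last 64-hex-character run in the URL path, or None."""
    matches = SHA256_HEX.findall(urlparse(url).path)
    return matches[-1] if matches else None


def fallback_url(original_url, local_server):
    """Build the URL to try on a local server when the original is gone."""
    digest = blob_hash_from_url(original_url)
    if digest is None:
        return None  # not a hash-addressed URL, nothing to fall back to
    return f"{local_server.rstrip('/')}/{digest}"


def matches_hash(blob, digest):
    """Check that the bytes we got back really are the requested blob."""
    return hashlib.sha256(blob).hexdigest() == digest


if __name__ == "__main__":
    digest = hashlib.sha256(b"example image bytes").hexdigest()
    dead = f"https://cdn.example.com/{digest}.png"
    print(fallback_url(dead, "http://localhost:3000"))
```

Because the address is the hash, a client can verify whatever the local server returns with `matches_hash` and does not have to trust the server.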

no need to apologize! all ideas are good ideas when we're on the frontier.

you might like to look into the WARC file format. they are "recordings" of web request/response cycles and can be "played back" to emulate a previous web session. this is what the Wayback Machine uses to create such high-fidelity archives.
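
to make the "recording" idea concrete, here is a minimal, simplified sketch of what a single WARC response record looks like on disk: a version line, a few named header fields, a blank line, then the raw HTTP response. the helper names are hypothetical, and a real archiver would rather use an existing library (warcio, for example) and also write request records, digests, and so on.

```python
import uuid
from datetime import datetime, timezone


def warc_response_record(target_uri, http_payload):
    """Wrap raw HTTP response bytes in one simplified WARC/1.0 record."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", now),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", "application/http; msgtype=response"),
        ("Content-Length", str(len(http_payload))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # A record ends with two CRLFs after the content block.
    return head.encode("utf-8") + http_payload + b"\r\n\r\n"


def parse_warc_record(record):
    """Split one record back into (version, header fields, payload)."""
    head, _, rest = record.partition(b"\r\n\r\n")
    lines = head.decode("utf-8").split("\r\n")
    fields = dict(line.split(": ", 1) for line in lines[1:])
    length = int(fields["Content-Length"])
    return lines[0], fields, rest[:length]


if __name__ == "__main__":
    payload = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>hi</p>"
    record = warc_response_record("https://example.com/", payload)
    version, fields, body = parse_warc_record(record)
    print(version, fields["WARC-Target-URI"], len(body))
```

"playback" then means serving the stored payload again whenever the browser asks for the matching `WARC-Target-URI`, which is the part that needs a dedicated client or extension.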

these are files that could be saved in blossom servers, potentially. the "playback" step is a bit complicated, unfortunately - it's not as simple as just loading the file from a server, the way an image or video works.

but it absolutely is possible to create a client or web extension that would do this full loop. many products already exist that do this. the trick would be adapting them for nostr/blossom/WebTorrent, etc.

I think a good (tangential) first step would be creating text/html-only simple archives of sites - similar to what you get with "Reader Mode" in web browsers, and create basic nostr notes out of those, on relays.
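
a rough sketch of that first step, using only the Python standard library: strip a page down to its readable text and wrap it in an unsigned nostr event. the choice of kind 30023 (long-form content) and of the "d", "title" and "r" tags is an assumption about how such an archive note could look, not an established convention, and signing (pubkey, id, sig) is left to whatever signer the client uses.

```python
import time
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping scripts, styles and page chrome."""

    SKIP = {"script", "style", "nav", "header", "footer", "aside"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.in_title = False
        self.title = ""
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "title":
            self.in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.in_title:
            self.title = text
        elif self.skip_depth == 0:
            self.chunks.append(text)


def archive_event(url, html, created_at=None):
    """Build an unsigned event template holding the text-only archive."""
    parser = TextExtractor()
    parser.feed(html)
    return {
        "kind": 30023,  # assumption: long-form content kind
        "created_at": int(time.time()) if created_at is None else created_at,
        "tags": [["d", url], ["title", parser.title], ["r", url]],
        "content": "\n\n".join(parser.chunks),
    }


if __name__ == "__main__":
    page = "<html><head><title>Hello</title></head><body><p>First.</p></body></html>"
    print(archive_event("https://example.com/post", page))
```

this is far cruder than a real Reader Mode (no boilerplate scoring, no images), but it shows how small the "simple archive as a note" loop could be.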

once we establish a norm of creating and sharing these simple versions, we can move on to things like provable archives, richer recordings/playback, etc

I have an ArchiveBox deployment in the pipeline, so let’s see if it’s usable…

It looks great. But it must be used with another tool for local indexing and search. I'm sure those exist as well. Imagining "Archivarr, Indexarr, Searcharr and Browsarr" neatly orchestrated together.