Self-hosted solution saving and indexing everything you've seen online, keeping a complete archive, tracking every change in the content. This sounds like heaven and hell at the same time. But I guess with smart search, filtering, AI, and whatnot, this will be amazing. I wonder if there are projects trying to achieve this.
Discussion
Hopefully there are :)
Yesterday or the day before yesterday, someone wrote about how KYC will kill Web2.0 and how everything interesting is, or will be, hidden or obfuscated. They said that Web5.0 will become a network of personal information silos. I can't imagine how that would work, but the observation is correct, and the concept is definitely interesting.
I actually stumbled upon nostr:nprofile1qqsza748zkamgmw4he4hm2xhwqpxd5gkwju38wqh3twmtshx8kv8xvgpr9mhxue69uhhqatjv9mxjerp9ehx7um5wghxcctwvsq32amnwvaz7tmhda6zumn0wd68ytnsv9e8g7gprpmhxue69uhhyetvv9ujumn0wdnxcctjv5hxxmmdd79m6g because he used to work on something very similar up until last year or so.
I am no longer with the company, but the project may see the light of day soon? no idea, but here it is:
I've got a few ideas for doing this over nostr that I'd like to work on someday. currently occupied with #catallax though
Notes themselves are easy: an "a" tag and a "p" tag, and that's it. Media in those notes is more difficult, but querying a local blossom server if the original is no longer available would also work. For content not native to Nostr, I'm not sure what the benefit would be, or whether relays would be happy about storing it... Maybe metadata distribution? I don't know, I'm a noob and you've been working on it for years. Sorry.
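A rough sketch of the blossom fallback idea, assuming the media URL already contains the blob's sha256 hash (as blossom URLs do) and that a blossom server serves blobs at `/<sha256>` with an optional file extension. The function name and the `http://localhost:3000` address are made up for illustration:

```python
import posixpath
import re
from urllib.parse import urlparse

# A standalone run of exactly 64 hex characters, i.e. a sha256 digest.
SHA256_RE = re.compile(r"(?<![0-9a-f])[0-9a-f]{64}(?![0-9a-f])")


def blossom_fallback_urls(original_url, servers=("http://localhost:3000",)):
    """Return candidate URLs for the same blob on other blossom servers.

    Hypothetical helper: returns an empty list when the original URL
    carries no sha256 hash, since there is then nothing to look up by.
    """
    path = urlparse(original_url).path.lower()
    match = SHA256_RE.search(path)
    if match is None:
        return []
    sha256 = match.group(0)
    extension = posixpath.splitext(path)[1]  # e.g. ".png", or "" if none
    return [f"{server.rstrip('/')}/{sha256}{extension}" for server in servers]
```

A client would try the original URL first and only walk this list if that request fails, ideally checking that the downloaded bytes really hash to the expected sha256.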
no need to apologize! all ideas are good ideas when we're on the frontier.
you might like to look into the WARC file format. they are "recordings" of web request/response cycles and can be "played back" to emulate a previous web session. this is what the Wayback Machine uses to create such high-fidelity archives.
these are files that could be saved in blossom servers, potentially. the "playback" step is a bit complicated, unfortunately - it's not as simple as just loading the file from a server, the way an image or video works.
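To give a feel for what a WARC "recording" looks like on disk, here is a minimal sketch that wraps one raw HTTP response in a single WARC/1.0 `response` record using only the standard library. The function name is made up, and real tools write more than this (request records, digests, gzip compression), so treat it as an illustration of the record layout rather than a full writer:

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_warc_response_record(
    target_uri: str,
    http_response: bytes,
    when: Optional[datetime] = None,
) -> bytes:
    """Wrap raw HTTP response bytes (status line, headers, body) in one WARC record."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    header_lines = [
        "WARC/1.0",
        "WARC-Type: response",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {when.strftime('%Y-%m-%dT%H:%M:%SZ')}",
        f"WARC-Target-URI: {target_uri}",
        "Content-Type: application/http; msgtype=response",
        f"Content-Length: {len(http_response)}",
    ]
    # Header block, a blank line, the payload, then two CRLFs to close the record.
    head = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    return head + http_response + b"\r\n\r\n"
```

The resulting bytes are an ordinary file, so they could in principle be uploaded to a blossom server like any other blob; the hard part remains the playback side.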
but it absolutely is possible to create a client or web extension that would do this full loop. many products already exist that do this. the trick would be adapting them for nostr/blossom/WebTorrent, etc.
I think a good (tangential) first step would be creating text/html-only simple archives of sites - similar to what you get with "Reader Mode" in web browsers, and create basic nostr notes out of those, on relays.
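A minimal sketch of that first step, using only the standard library: strip a page down to its title and visible text, then shape it into an unsigned nostr event. The class and function names are made up, and the choice of kind 30023 (long-form content) with "d", "title" and "r" tags is an assumption about how such a note could look, not an established convention for archives. Signing and publishing are left out:

```python
import time
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the <title> and visible text, skipping scripts and styles."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._in_title = False
        self.title = ""
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title += text
        else:
            self.chunks.append(text)


def build_unsigned_archive_note(url, html, kind=30023, created_at=None):
    """Return an unsigned event dict; id, pubkey and sig must be added by a signer."""
    extractor = TextExtractor()
    extractor.feed(html)
    return {
        "kind": kind,  # assumption: long-form content kind
        "created_at": int(created_at if created_at is not None else time.time()),
        "tags": [["d", url], ["title", extractor.title], ["r", url]],
        "content": "\n\n".join(extractor.chunks),
    }
```

This is far cruder than a real "Reader Mode" (it keeps navigation text, for example), but it shows how small the text-only version of an archive can be.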
once we establish a norm of creating and sharing these simple versions, we can move on to things like provable archives, richer recordings/playback, etc