GM. Mood. 😊

Discussion

GM☀️

GM ☕

GM ☕

we should discuss web archiving. It's such a perfect fit for nostr and your project specifically

I believe Annas-archive.org is built on bittorrent. Not really sure.

I think we'd need a bigger server. 🤔😅

That’s what she said.

decentralize the archives. anyone doing heavy archiving on nostr would run their own relay where they would post WARC web archive records as nostr notes.

the CDX index entries to those pages would be posted as notes and shared around, and anyone wishing to replay a page would be (automatically, in the background) finding all the relevant WARC-notes to reconstruct the page.

the same idea as the wayback machine web archive, but distributed.
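A rough sketch of how the WARC-note and CDX-note idea above could look, for illustration only. The kind numbers, the tag layout ("r" for the captured URL, "e" pointing at the WARC note, "captured_at") and all function names are assumptions made up for this example, not an existing NIP or anyone's actual implementation. Only the id hashing follows the NIP-01 serialization, and signing is left out.

```python
import hashlib
import json

WARC_KIND = 30000  # hypothetical kind number for a WARC-record note
CDX_KIND = 30001   # hypothetical kind number for a CDX index note


def event_id(pubkey, created_at, kind, tags, content):
    """NIP-01 style id: sha256 over the compact JSON array serialization."""
    payload = json.dumps(
        [0, pubkey, created_at, kind, tags, content],
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_event(pubkey, created_at, kind, tags, content):
    """Build an unsigned event dict (a real note would also carry a sig)."""
    return {
        "id": event_id(pubkey, created_at, kind, tags, content),
        "pubkey": pubkey,
        "created_at": created_at,
        "kind": kind,
        "tags": tags,
        "content": content,
    }


def warc_note(pubkey, created_at, url, warc_record_text):
    """One captured WARC record, stored as the note content."""
    return make_event(pubkey, created_at, WARC_KIND, [["r", url]], warc_record_text)


def cdx_note(pubkey, created_at, url, captured_at, warc_event_id):
    """Index entry that points from a URL and capture time to a WARC note."""
    tags = [["r", url], ["e", warc_event_id], ["captured_at", captured_at]]
    return make_event(pubkey, created_at, CDX_KIND, tags, "")


def find_warc_ids(cdx_events, url):
    """Return the ids of the WARC notes that an index lists for a URL."""
    ids = []
    for ev in cdx_events:
        tags = {t[0]: t[1] for t in ev["tags"]}
        if ev["kind"] == CDX_KIND and tags.get("r") == url:
            ids.append(tags["e"])
    return ids
```

A replay client would collect CDX notes from relays, call something like `find_warc_ids` for the page and for each sub-asset, then fetch those WARC notes by id.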

Would it be storing the pages converted into Asciidoc or leaving as HTML?

We are planning on storing entire webpages in the kind 31 citation events for "external web references", so that people can refer to those, rather than the web pages, in case the pages change. The page content then goes in the event's "content" field.

Is this the sort of thing you mean?

https://next-alexandria.gitcitadel.eu/publication?d=gitcitadel-project-documentation-citations-specification-9-by-stella-v-1

that can work, but the playback experience will be severely limited. for instance, any "external assets" (images, CSS, etc.) will either break or try to load the live asset.

WARC handles this by capturing the original request to that asset and recording it. so when loading later you can also replay these recorded sub-assets.

see my other reply

Yeah, it's more meant as a snapshot for document references.

I dig. that makes sense for this purpose!

the system I'm describing *does* require the whole replay client side setup, so it's constrained in that way.

having a basic "reader mode" view of the snapshot you're describing sounds really appropriate for nostr.

...might have some issues with sites that load content dynamically on scroll.. but it's impossible to handle everything. the web is so sadly broken these days. hardly anything is a document anymore!

the WARC (and WACZ) file format (used by Internet Archive and others) is a bit special. HTTP requests and responses are recorded and written into the WARC file, which is subsequently used by replay software to "play back" these responses in place of the original server that once provided these responses.

this is why Wayback Machine can provide such high fidelity replay experiences.

it's a fundamentally simple approach that provides a really powerful experience. you only need to be able to store basic text in order to provide this experience, as a server.
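To make the record-then-replay idea concrete, here is a toy sketch. It is not the WARC format itself (real WARC files hold typed records with their own headers) and not how any particular replay tool works; the `record` and `replay` names and the in-memory dict are assumptions for illustration. The point is only that replay answers requests from stored responses instead of contacting the original server, sub-assets included.

```python
def record(archive, method, url, status, headers, body):
    """Store the response that was observed for one request."""
    archive[(method.upper(), url)] = {
        "status": status,
        "headers": dict(headers),
        "body": body,
    }


def replay(archive, method, url):
    """Answer a request from the archive; never touches the live web."""
    hit = archive.get((method.upper(), url))
    if hit is None:
        # Nothing was captured for this request, so there is nothing to replay.
        return {"status": 404, "headers": {}, "body": b""}
    return hit


archive = {}
# Capture time: the page and the stylesheet it pulls in are both recorded.
record(archive, "GET", "https://example.com/", 200,
       {"Content-Type": "text/html"},
       b'<link rel="stylesheet" href="/style.css"><p>hello</p>')
record(archive, "GET", "https://example.com/style.css", 200,
       {"Content-Type": "text/css"}, b"p { color: red }")

# Replay time: the sub-asset request is served from the recording too.
css = replay(archive, "get", "https://example.com/style.css")
```

Since the stored data is just request and response text, a server that can hold basic text is enough to back this kind of replay.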

my company has a lot of work invested into this already. not specifically for nostr yet, but into generalizing web archive replay and building custom WARC servers - all to lay the foundation for decentralized web archiving

.... #cashu - gated micropayments for web replays of formerly paywalled content....

one paywall subscription + active web recording could be shared by countless users, each paying an infinitesimal fraction
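Back-of-the-envelope arithmetic for that split, with made-up numbers. The subscription price, the replay count, and the `sats_per_replay` helper are all hypothetical, and nothing here uses a real Cashu API; it only shows how small each share gets.

```python
import math


def sats_per_replay(subscription_sats, expected_replays):
    """Smallest whole-sat price per replay that covers the subscription."""
    if expected_replays <= 0:
        raise ValueError("expected_replays must be positive")
    return math.ceil(subscription_sats / expected_replays)


# Hypothetical: a 50,000 sat subscription spread over 10,000 paid replays.
price = sats_per_replay(50_000, 10_000)
```

Rounding up to whole sats means the archiver never ends up short, at the cost of slightly overcollecting when the division is not exact.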

GM Laeserin. Qapla' 👀

GM✨

GM laeserin 🤩✌️🌞🌞💃

more humans please, but no more captchas

based

lol, someone made a zap with 21 sats and then commented "million"

anyway

😅 Yeah, saw that. Better than the people who use a ⚡ as an emoji, but don't zap.

HUMANS! o noes!

not propaganda spewing biased robots?

no, we got salty blackjack loving machine oil hard drinking robots

Bender would not approve of this