Replying to Gigi

Here's a left-side-of-the-bell-curve way to do the Internet Archive "right":

- Create browser extension

- User loads page

- User clicks "archive" button

- Whatever is in user's browser gets signed & published to relays

- Archival event contains URL, timestamp, etc.

- Do OpenTimestamps attestation via NIP-03

- ???

- Profit
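The middle steps above (capture, URL, timestamp) could be sketched roughly as follows. This is only an illustration using the Python standard library: the event kind `ARCHIVE_KIND`, the function name `build_archive_event`, and the choice of an `r` tag for the URL and an `x` tag for the content hash are all assumptions, not an agreed-upon format. The event id is computed the way NIP-01 describes (SHA-256 over the serialized array). Signing, publishing to relays, and the NIP-03 attestation are left out because they need libraries beyond the standard library.

```python
import hashlib
import json
import time

# Hypothetical kind number for an "archived page" event; no NIP assigns this.
ARCHIVE_KIND = 4242


def build_archive_event(pubkey_hex, url, page_html, created_at=None):
    """Build an unsigned Nostr-style event holding a page snapshot."""
    if created_at is None:
        created_at = int(time.time())

    # Assumed tag layout: "r" carries the source URL, "x" the SHA-256 of the content.
    tags = [
        ["r", url],
        ["x", hashlib.sha256(page_html.encode("utf-8")).hexdigest()],
    ]

    # NIP-01 id: SHA-256 of the compact JSON array
    # [0, pubkey, created_at, kind, tags, content], UTF-8 encoded.
    # json.dumps matches the NIP-01 escaping rules for typical page text;
    # unusual control characters may be escaped differently.
    serialized = json.dumps(
        [0, pubkey_hex, created_at, ARCHIVE_KIND, tags, page_html],
        separators=(",", ":"),
        ensure_ascii=False,
    )
    event_id = hashlib.sha256(serialized.encode("utf-8")).hexdigest()

    return {
        "id": event_id,
        "pubkey": pubkey_hex,
        "created_at": created_at,
        "kind": ARCHIVE_KIND,
        "tags": tags,
        "content": page_html,
    }
```

The extension would then sign the `id` with the user's key and send the event to relays; the same `id` is what an OpenTimestamps attestation would reference.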

I'm sure there are a hundred details I'm glossing over, but because this is user-driven and does all the archiving "on the edge", it would just work, not only in theory but very much in practice.

The Internet Archive can be blocked because it is a central service. When users make an archival request, they don't do the archiving themselves; they send the request to a central server that does it. And that central server can be blocked.

NoStrFromObject 4mo ago

Wasn't the P2P YaCy search engine doing archival, browsing, and spidering back in the old days?


Discussion

noname 4mo ago

Web archiving is child's play.

What we want to do is decentralize web crawl data. Nobody uses YaCy, which implies it failed.

What is the total size of the latest Common Crawl? (estimate)

Compressed size (gzipped WARC): 250 to 350 TB

Solve this.
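To get a feel for that number, here is a back-of-the-envelope sketch of how much each peer would have to store if the crawl were spread evenly across volunteers. The function name `per_peer_gb`, the replication factor, and the peer count are hypothetical inputs chosen for illustration; only the 250 to 350 TB range comes from the estimate above.

```python
def per_peer_gb(total_tb, replication, peers):
    """Storage per peer in GB when total_tb is replicated and split evenly."""
    # 1 TB is treated as 1000 GB (decimal units).
    return total_tb * 1000 * replication / peers


# Illustrative inputs only: 300 TB (midpoint of the estimate),
# 3 copies of every chunk, 10,000 participating peers.
print(per_peer_gb(300, 3, 10_000))  # 90.0
```

Even split is the optimistic case; it ignores churn, uneven peers, and the bandwidth needed to crawl and re-replicate the data.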
