nostr:npub1vuwsup87kt73vczp64mztq3txmkr9mhprlmwk7a9q7fmxzvlgxzqlsyv2p

I like the person who wrote the comments in their robots.txt. I can feel their cynicism and it feels so good.

nostr:npub1ygrf5pf0ahkyyyrngv3twymqcz7pav6ssymrhdj90luscm9llszq56cg6z At no point did I actually think to look at their robots.txt… Honestly this is a good ledger of ‘known bad crawlers’ lol.

I found out years ago that (at that time anyway) they had a bunch of regular expressions saved somewhere to match known spammy/URL shortener URLs, so I grabbed those to filter my guestbook spam lol

nostr:npub1vuwsup87kt73vczp64mztq3txmkr9mhprlmwk7a9q7fmxzvlgxzqlsyv2p

I bet a few other non-profit/free websites also keep a good index. Archive.org is surprisingly permissive. They were my next guess.

I, for one, tend to like a robots.txt like the one your instance has.