#[4] you seem to have a lot of free time and are not involved in other projects so plz build 😂

Discussion

😂😂😂😂

it does sound like a super quick thing, minus scraping markdown from the HTML 🤔

Pandoc's pretty capable. I wrote a Python library for HTML -> Markdown long ago, when I was a lesser engineer. It worked against a pretty constrained set of HTML. Not sure how well it would work against The Internet. https://github.com/crossway/antimarkdown
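For anyone wondering what "a pretty constrained set of HTML" looks like in practice, here is a rough sketch using only Python's standard `html.parser`. To be clear, this is not how antimarkdown or Pandoc work; `TinyMarkdown` and `html_to_markdown` are made-up names for illustration, and it only handles a handful of tags (headings, bold, italics, links, paragraphs, list items). Anything else is passed through as plain text.

```python
from html.parser import HTMLParser


class TinyMarkdown(HTMLParser):
    """Hypothetical converter that understands only a small, fixed set of tags."""

    def __init__(self):
        super().__init__()
        self.out = []     # Markdown fragments, joined at the end
        self.href = None  # target of the link currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            # <h2> becomes "## ", and so on
            self.out.append("#" * int(tag[1]) + " ")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag == "li":
            self.out.append("- ")
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in ("strong", "b"):
            self.out.append("**")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag == "a":
            self.out.append("](" + (self.href or "") + ")")
            self.href = None
        elif tag in ("h1", "h2", "h3", "p"):
            # block-level elements end with a blank line
            self.out.append("\n\n")
        elif tag == "li":
            self.out.append("\n")

    def handle_data(self, data):
        # text between tags is copied through unchanged
        self.out.append(data)


def html_to_markdown(html):
    """Convert a small HTML fragment to Markdown; unknown tags are ignored."""
    parser = TinyMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()
```

Calling `html_to_markdown('<h1>Title</h1><p>Some <b>bold</b> text.</p>')` gives `# Title`, a blank line, then `Some **bold** text.`. Nested lists, tables, code blocks, and malformed markup are all out of scope here, which is roughly why a toy like this struggles against The Internet.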

I'd use readability.

Use it in my toolchain, works well.

https://github.com/mozilla/readability