Hm. They usually already come in EPUB3 format.

Using HTML has the benefit of tags and images, of course. I should probably switch to that and write a parser based on the HTML tags.
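For what it's worth, a tag-based parser could start out roughly like this. It is only a sketch using Python's standard `html.parser`; the class name `BookHTMLParser`, the helper `parse_blocks`, and the choice of which tags to keep are my own assumptions, not anything already written.

```python
from html.parser import HTMLParser


class BookHTMLParser(HTMLParser):
    """Hypothetical parser: collects (kind, value) blocks from HTML tags."""

    # Assumption: only headings and paragraphs matter for the text flow.
    BLOCK_TAGS = {"h1", "h2", "h3", "p"}

    def __init__(self):
        super().__init__()
        self.blocks = []   # ordered list of (tag, text) or ("img", src)
        self._tag = None   # block tag currently being read, if any
        self._text = []    # text fragments collected for that tag

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._tag = tag
            self._text = []
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.blocks.append(("img", src))

    def handle_data(self, data):
        if self._tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace, including stray newlines.
            text = " ".join("".join(self._text).split())
            if text:
                self.blocks.append((tag, text))
            self._tag = None


def parse_blocks(html):
    """Return the blocks found in an HTML string, in document order."""
    parser = BookHTMLParser()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Calling `parse_blocks(...)` on a chapter's HTML gives a flat list of headings, paragraphs, and image sources to build on. Nested block tags are not handled here.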

Is he just scraping their website?

Discussion

I am not sure how he handles the initial download. But he wanted to put in a bunch of heuristics to produce a more presentable EPUB, without mis-sized images, unfortunate line breaks, etc.

That's probably where an LLM would come in useful, for refining.
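Before reaching for an LLM, the simple cases could probably be covered by plain heuristics along these lines. This is a guess at what such rules might look like, not his actual code; `unwrap_lines`, `fit_image`, and the 600-pixel default are made up for illustration.

```python
import re


def unwrap_lines(text):
    """Join hard-wrapped lines inside each paragraph (hypothetical heuristic).

    Paragraphs are assumed to be separated by blank lines.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    cleaned = []
    for para in paragraphs:
        # Rejoin words hyphenated across a line break. Limitation: this also
        # merges genuine compounds that happen to break at the hyphen.
        para = re.sub(r"(\w)-\n(\w)", r"\1\2", para)
        # Turn the remaining line breaks and runs of spaces into single spaces.
        para = " ".join(para.split())
        if para:
            cleaned.append(para)
    return "\n\n".join(cleaned)


def fit_image(width, height, max_width=600):
    """Scale an image down to max_width, keeping its aspect ratio."""
    if width <= max_width:
        return width, height
    scale = max_width / width
    return max_width, round(height * scale)
```

The leftovers, such as deciding whether a hyphen at a line end belongs to the word, are where an LLM pass might actually help.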