Subnostr

I talked to my cousin about it, and he is just converting the HTML5 versions of the Project Gutenberg ebooks into EPUB3 after doing some cleanup.

Silberengel 1y ago

Hm. They usually already offer an EPUB3 version.

Using HTML has the benefit of tags and images, of course. I should probably switch to those and write a parser based upon the HTML tags.

Is he just scraping their website?
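
A tag-based parser like the one mentioned above could start out roughly like this. It is only a minimal sketch using Python's standard `html.parser`; the class name `BookParser`, the helper `parse_book`, and the choice of tags to collect are assumptions made for illustration, not anything Project Gutenberg or the cousin's tool prescribes.

```python
from html.parser import HTMLParser


class BookParser(HTMLParser):
    """Collect headings, paragraphs, and image sources from an HTML book."""

    # Hypothetical choice of block-level tags worth keeping.
    BLOCK_TAGS = {"h1", "h2", "h3", "p"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []   # (tag, text) pairs in document order
        self.images = []   # values of <img src="...">
        self._tag = None   # block tag currently being read, if any
        self._buf = []     # text fragments of the current block

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._tag, self._buf = tag, []
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_data(self, data):
        # Inline tags such as <i> are ignored, but their text is kept.
        if self._tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace, including source line breaks.
            text = " ".join("".join(self._buf).split())
            if text:
                self.blocks.append((tag, text))
            self._tag = None


def parse_book(html):
    """Return (blocks, images) for one HTML document given as a string."""
    parser = BookParser()
    parser.feed(html)
    parser.close()
    return parser.blocks, parser.images
```

Calling `parse_book` on the HTML text of a book gives an ordered list of headings and paragraphs plus the image paths, which is about the minimum needed to rebuild chapters in another format.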


Discussion

Daniel Wigton 1y ago

I am not sure how he handles the initial download, but he wanted to put in a bunch of heuristics to make a more presentable EPUB without missized images, unfortunate line breaks, etc.
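
To give an idea of what such heuristics might look like, here is a rough Python sketch. The two rules (rejoining hard-wrapped lines and capping image width), the function names, and the 600-pixel limit are all assumptions for illustration; they are not his actual rules.

```python
import re


def unwrap_lines(text):
    """Join hard-wrapped lines inside each paragraph.

    Paragraphs are assumed to be separated by blank lines.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)


def fit_image(width, height, max_width=600):
    """Scale an image down to max_width, keeping its aspect ratio.

    Images that are already narrow enough are left unchanged.
    """
    if width <= max_width:
        return width, height
    scale = max_width / width
    return max_width, round(height * scale)
```

For example, `fit_image(1200, 800)` gives `(600, 400)`, and `unwrap_lines` turns a paragraph that was wrapped at a fixed column into a single line while keeping the paragraph breaks.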

Silberengel 1y ago

That's probably where an LLM would come in useful, for refining.