I think wikifreedia would get a huge boost if Wikipedia were imported, and since that's inevitable anyway, looking into how to do it would not be a waste of time either. On the contrary, it would provide test data on how well relays cope with a ton of data.

If I were to do this, I wouldn't aim to import all of Wikipedia either. I would go in steps: the 10k most-read Wikipedia articles first, then 100k, and so on. But I did actually try to import just one article and it failed due to its size, so there's that. Relays are not ready for really long-form content. :(
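A rough way to see in advance whether an article is too big for a relay is to compare its size against the limits the relay advertises. The sketch below is only an illustration: the function name `fits_relay` is made up, the event is a partial one (no id, pubkey or signature, so real messages are somewhat larger), and the two limits are assumed to come from the relay's information document (as far as I know, NIP-11 calls them `max_content_length` and `max_message_length`).

```python
import json


def fits_relay(content: str, title: str,
               max_content_length: int, max_message_length: int) -> dict:
    """Estimate whether an article would fit the limits a relay advertises.

    Both limits are assumptions supplied by the caller; nothing is fetched.
    """
    # Partial, unsigned event: a real one also carries id, pubkey, sig, etc.
    event = {"content": content, "tags": [["title", title]]}
    message = json.dumps(["EVENT", event], ensure_ascii=False)

    content_chars = len(content)                    # counted in characters
    message_bytes = len(message.encode("utf-8"))    # counted in bytes
    return {
        "content_chars": content_chars,
        "message_bytes": message_bytes,
        "fits": (content_chars <= max_content_length
                 and message_bytes <= max_message_length),
    }
```

Running this over the first batch of articles before publishing would at least show how many of them need cropping or splitting.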

Discussion

The problem is that the wikimedia format is completely absurd and unreasonable.

I have no opinion on the format, but the content is valuable, so it would be worth using to seed a free encyclopedia project.

If I were to take on that task, I would transcode and maybe crop content in an automated fashion. Wikipedia is such a big compendium of knowledge that starting from zero is a waste of time. Yes, bias is a big issue in WP, but for many thousands of articles it is not. Where would you start if you wanted to have https://en.wikipedia.org/wiki/Navier%E2%80%93Stokes_equations on wikifreedia, maybe even because you wanted to "remove bias"? Maybe wikifreedia could have "import from Wikipedia" as an option, because seriously, starting from zero makes no sense at all.
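One simple form of automated cropping would be to keep only the lead section of an article, i.e. everything before the first heading. A minimal sketch, assuming the input is raw wikitext with `== Heading ==` style headings; `lead_section` is a hypothetical helper, not part of any existing tool:

```python
import re

# A wikitext heading line such as "== History ==" or "=== Details ===".
HEADING = re.compile(r"^==+[^=\n].*?==+[ \t]*$", re.MULTILINE)


def lead_section(wikitext: str) -> str:
    """Return the text before the first heading (the article's lead)."""
    match = HEADING.search(wikitext)
    lead = wikitext[:match.start()] if match else wikitext
    return lead.strip()
```

Cropping to the lead would also keep most articles far below any size limit, at the cost of dropping the detailed sections.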

And I put "remove bias" between quotes as I could have written "add bias" just as well.

I'm a mathematician and I like stuff that's free of bias but I also have to agree that "free of bias" is almost impossible for most subjects. For math though - the best proof is the shortest one. That's as close to free of bias as it can get and for math that is also not the wrong approach.

There's no need to directly copy the format. One could have a processing step that refactors it to whatever one considers more reasonable.
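Such a processing step could start out very small. The sketch below rewrites only headings, bold, italics and internal links into Markdown; templates, tables, references and everything else in wikitext are left untouched, so it is an illustration of the idea and nowhere near a full converter. The function name and the choice to turn link targets into underscore-separated paths are assumptions.

```python
import re


def wikitext_to_markdown(text: str) -> str:
    """Convert a few common wikitext constructs to Markdown."""

    # "== Title ==" -> "## Title" (heading level = number of "=" signs).
    def heading(m: re.Match) -> str:
        return "#" * len(m.group(1)) + " " + m.group(2)

    text = re.sub(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", heading, text,
                  flags=re.MULTILINE)

    # Bold must be handled before italics, since ''' contains ''.
    text = re.sub(r"'''(.+?)'''", r"**\1**", text)
    text = re.sub(r"''(.+?)''", r"*\1*", text)

    # "[[Target|label]]" or "[[Target]]" -> "[label](Target)".
    def link(m: re.Match) -> str:
        target = m.group(1).strip()
        label = (m.group(2) or target).strip()
        return f"[{label}]({target.replace(' ', '_')})"

    return re.sub(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]", link, text)
```

For example, `wikitext_to_markdown("''viscous'' [[fluid dynamics|fluid]]")` gives `*viscous* [fluid](fluid_dynamics)`.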

btw, I'm not sure what's so unreasonable about the wiki-format.

You are welcome to try, and I would be very pleased if you succeeded.

I'll be the first to admit I don't know anything about the Wikipedia format, beyond regular assumptions xD

I could take a look at it however. I've developed a couple of parsers & document conversion tools.

What's the preferred/ideal output?

Markdown.

This might be useful:

https://pandoc.org/
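pandoc has a `mediawiki` reader and a `markdown` writer, so the conversion can be delegated to it instead of hand-written parsing. A minimal sketch, assuming pandoc is installed and on the PATH; the helper names are made up for this example, and how well the output handles Wikipedia's templates would need to be checked on real articles.

```python
import shutil
import subprocess


def pandoc_command(source_format: str = "mediawiki",
                   target_format: str = "markdown") -> list:
    """Build the pandoc command line (reads stdin, writes stdout)."""
    return ["pandoc", "--from", source_format, "--to", target_format]


def convert_with_pandoc(wikitext: str) -> str:
    """Pipe wikitext through pandoc and return the Markdown it produces."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    result = subprocess.run(pandoc_command(), input=wikitext,
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `convert_with_pandoc("== History ==\nSome '''bold''' text.")` should return the same content as Markdown.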

Good point! Or go for a topic like lists of countries, discographies of top artists, etc.