> A data set based on all arXiv publications, pre-processed for NLP, including structured full-text and citation network
I generated it from this dataset.