You just need Anna's Archive. That's what OpenAI allegedly used to pre-train their earlier GPT models.

nostr:nevent1qqsvlw8upyjusx3fge5yzup9mut2ynt7jxy0ujq56q4dl5vccueq2ccpzemhxue69uhhyetvv9ujumt0wd68ytnsw43z7q3qecs6dzu4vmns80yyq5gux78s75sp6aaesg5xpxq2zftylzvpnagqxpqqqqqqzangp3u

Discussion

I think the point is that relying on "the only one we need" is a bad idea.

It's a massive torrent compiled from every source listed in the quoted note. It's a meta-source, not the source itself.

Or get 69,000 books on a flash drive for $50.

shop.encyclosphere.org