Your question doesn’t even make sense. The whole point of training on as large a dataset as possible is so the model picks up the general rules of text instead of just overfitting and regurgitating the training data.

If you ask a model for a book and it spits out the exact book, you’ve trained a terrible model.
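
For illustration, here’s a minimal sketch of one naive way to check for that kind of verbatim regurgitation: prompt a model with the opening of a text and measure how much of its continuation matches the source character-for-character. The `model_generate` call and the score threshold are hypothetical placeholders, not any real model’s API.

```python
from difflib import SequenceMatcher

def regurgitation_score(generated: str, source: str) -> float:
    """Fraction of the generated text covered by its longest
    verbatim run that also appears in the source text."""
    if not generated:
        return 0.0
    match = SequenceMatcher(None, generated, source).find_longest_match(
        0, len(generated), 0, len(source)
    )
    return match.size / len(generated)

# Hypothetical usage: `model_generate` stands in for whatever
# inference call your model actually exposes.
# continuation = model_generate(prompt=source_book[:500], max_tokens=2000)
# score = regurgitation_score(continuation, source_book)
# A score near 1.0 means the model reproduces the book verbatim,
# i.e., it memorized rather than generalized.
```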


Discussion

My question is based on your premise:

> Anthropic is not destroying the books, they just extracted the information and now that they're storing the text digitally there's no need to store it physically as well.

Let me spell it out if you don’t get it: by your own argument, a well-trained model doesn’t reproduce the books’ text, so the text isn’t actually retrievable from the model. If all you do with the books is train models on them and then destroy the physical copies, the books are gone. That’s the same as burning them.