Nice to meet you too! What do you think the antidote could be? I feel like we are moving away from having free information, the thing we've been lucky enough to enjoy these last few decades on the internet.

When LLMs kill free info on the internet and become monopolies on information, you KNOW they are going to jack up prices for access. They will regardless, because right now they are operating at a big loss to get people used to them.

Discussion

If LLMs like DeepSeek R1 are anything to consider... that might be a start.

Consider what?

Locally downloaded LLMs, potentially with your own dataset written by humans.

That seems to be the path forward.

DeepSeek is a terrible candidate for this. It's already programmed with Chinese government propaganda. Ask it about Tiananmen Square. Usually doesn't work so well.

And where are you gonna get the dataset to train it? The window for scraping the internet is closing fast as more and more people put up anti-bot/anti-LLM walls in front of their content.

I suppose there's going to be a market for training datasets, but they won't be nearly as large as the ones GPT and the other leading models trained on (the entire internet).

It's still not gonna solve the main problem of free information disappearing from the internet.

GPT's dataset is unlawful and unethical. One engineer called this out, and was 38'd (murdered) for it.

All it takes is asking permission to allow one to get the contents of something and train an AI model with it, no scraping required.

DeepSeek R1 is a Free Software model (under MIT). Sure, it may have its limitations, but you can make a dataset to cause chronic forgetting of its programming by DeepSeek.

Jesuit agent Mike Adams made a dataset that induces chronic forgetting for any LLM, including DeepSeek. Take the propaganda out of there, and the reasoning capabilities of this computer program (AI is literally a computer program that can be weaponized by stupid engineers) would make humans obsolete if they were NOT weaponized.

I don't think training on stuff is unlawful and unethical. It's not the same as copying.

And, from what I understand, the dataset to train an LLM properly is HUGE. Not easily accessible.

DeepSeek has a gnarly license agreement where you don't own anything you make with it.

As a Free Software enthusiast, I'd have to disagree with the third statement. MIT is a Free Software license, much like GPL and BSD are. Someone may do some things to make MIT-licensed software proprietary, but I think that's usually rare. Otherwise, DeepSeek is Free Software when downloaded locally.