What prevents LLM data from being poisoned by sheer quantity of garbage?
If they're crawling the internet for data to feed into the LLMs, doesn't that mean data that appears _more often_ will carry more weight, instead of data that is "better"?
In other words, what is the "pagerank" of LLMs?
I think they have human verifiers/checkers, but then how reliable are they?
Who trains the AI trainers? That's a vicious cycle, isn't it?