Absolutely, if for no other reason than that they are using Reddit, of all places, as their data source. No bias issues there.
Discussion
I believe you know the answer. Ah, mini servers are cheap; the question is, whose AI model can you trust?
At this point, it needs to be a Free Software model (one that follows a Free Software license), and it must be truthful about what it's capable of and about its dataset.
Common Crawl is a cancer, and needs to be stopped big time.