I think Google has figured out where to get new raw data (raw in the sense that it captures how people actually speak) to train Gemini on: Google Meet.

Gemini Transcripts are now fully integrated with all Google Workspace accounts (on by default). Businesses require documentation for liability claims, and AI is doing a better job than any human ever could. Even hospitals are using it now to document team conversations.

This gives Gemini training data on how people actually talk (as opposed to the YouTube videos it was trained on), and on private data before it reaches the markets.

Terrible for privacy, but a genius move.

Discussion

Gonna circle back Monday!

Recently switched away from Meet (formerly Hangouts, formerly Chat) for personal communications. Signal for now, hopefully systems like WhiteNoise in the future.

Gmail and other Google products have always had the deal "we harvest your data for ads, and that's why it's free." I feel like in the early days of Gmail that was better understood by users, because back in the 2000s email was still a service you paid for as part of your ISP.

Then in the 2010s, every few years we were shocked -- SHOCKED! -- to learn that Facebook was doing the same thing.

Yep. Same goes for Microsoft Teams and Copilot (OpenAI)

True

Same for *any* service that now integrates AI for allegedly making something more convenient.