Ostrich Mcawesome
Privacy & Security Consultant | Cybersecurity Infrastructure Professional. Offering free privacy consultations to help you take control of your digital footprint. Tips appreciated in XMR if I've helped you out
Replying to Final

We're developing our own implementations of text-to-speech and speech-to-text for use in #GrapheneOS. Both are entirely open source, and we're avoiding so-called 'open' models that don't make their training data available. Instead, we're building truly open source implementations of both where all of the training data is itself open source. If you don't want to use our app for local text-to-speech and speech-to-text, you don't need to use it. Many people need this and want a better option.

We're working on TTS first, then STT. The TTS training data is LJ Speech (https://keithito.com/LJ-Speech-Dataset/), and the model is our own fork of Matcha-TTS.
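For anyone curious what "open training data" looks like in practice: LJ Speech ships as a directory of WAV clips plus a pipe-delimited `metadata.csv` mapping each clip ID to its raw and normalized transcripts. A minimal sketch of loading that metadata (the two sample lines below imitate the file's format but are not real dataset entries):

```python
import csv
import io

# LJ Speech's metadata.csv uses one pipe-delimited line per clip:
#   <clip id>|<raw transcript>|<normalized transcript>
# These sample lines are made-up stand-ins in that layout, not actual dataset rows.
SAMPLE_METADATA = (
    "LJ001-0001|Printing, in the only sense.|Printing, in the only sense.\n"
    "LJ001-0002|in being comparatively modern.|in being comparatively modern.\n"
)

def load_metadata(text):
    """Parse LJ Speech-style metadata into (wav_path, normalized_transcript) pairs."""
    rows = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    # Clips live under wavs/ in the dataset archive; the raw transcript is unused here.
    return [(f"wavs/{clip_id}.wav", normalized) for clip_id, _raw, normalized in rows]

pairs = load_metadata(SAMPLE_METADATA)
print(pairs[0])  # ('wavs/LJ001-0001.wav', 'Printing, in the only sense.')
```

Because the full mapping from audio to text is published like this, anyone can retrain the model from scratch or audit exactly what went into it.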

If people want, they can fork it and add, remove, or change the training data in any way they see fit. It's nothing like the so-called "open" models from OpenAI, Facebook, etc., where the only thing that's open is the neural network weights after training, with no way to know what data was used to train them and no way to reproduce the result.

Many blind users asked us to include one of the existing open source TTS apps so they could have a better option. None of the available open source apps meets our requirements for reasonable licensing, privacy, security or functionality. Therefore, we've developed our own text-to-speech app, which will be shipping soon, likely in January. We'll also be providing our own speech-to-text. We're using neural networks for both, which we're building ourselves.

What's your take on the FUTO Keyboard's model that they use for text-to-speech? I'm a bit ignorant about it.