I did some testing of RVC model training. I trained models on subsets of my Metokur dataset: 10 seconds, 1 minute, 10 minutes, 30 minutes, 60 minutes and 120 minutes of audio.
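For anyone wanting to repeat this, here is a rough sketch of how subsets of those sizes could be put together. It is not the script I used. It assumes the dataset is already split into short clips and that you have a list of (name, duration in seconds) pairs for them; the `pick_clips` helper and that list layout are made up for illustration.

```python
import random

# Subset sizes from the post, in seconds.
TARGETS_SECONDS = [10, 60, 600, 1800, 3600, 7200]


def pick_clips(clips, target_seconds, seed=0):
    """Greedily pick clips without going over target_seconds.

    clips: list of (name, duration_seconds) pairs (hypothetical layout).
    Returns (chosen_names, total_seconds).
    """
    shuffled = list(clips)
    # Fixed seed so the same subset is produced on every run.
    random.Random(seed).shuffle(shuffled)

    chosen, total = [], 0.0
    for name, duration in shuffled:
        if total + duration > target_seconds:
            continue  # this clip would overshoot the budget, skip it
        chosen.append(name)
        total += duration
    return chosen, total


if __name__ == "__main__":
    # Dummy durations just to show the call; real ones come from your files.
    demo = [("clip_%03d.wav" % i, 4.0) for i in range(50)]
    for target in TARGETS_SECONDS:
        names, total = pick_clips(demo, target)
        print(target, len(names), total)
```

The selection is greedy, so a subset can come out a bit under the target if no remaining clip fits. For the 10 second case you need at least one clip shorter than 10 seconds or the subset will be empty.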
Unsurprisingly, 10 seconds was not enough data for the model to be effective. Surprisingly, however, 1 minute produced acceptable quality and gave consistent generation results.
30 to 60 minutes does appear to be the sweet spot (as has been previously stated by others), and going above 60 minutes seems to “overbake” the model and fuck up the index training afterwards.
Given this, I’m going to generate future songs three times over: once with a 30-minute model, once with a 60-minute model and once with a 120-minute model. I will then check which one sounds the most stable and similar, and adopt that one full time.