Thanks @baconator and @Dominik
I will watch that public talk today.
I am still curious if anyone can answer the remaining question I had, which is:
How many phrases do you need to record to have 1,000 steps created?
I am interested in this as a programmer. I am fully aware that 1,000 steps will not result in anything usable. I am more just interested in learning the expected behavior of the software … more particularly, I am looking to learn if there is an estimated number of recorded phrases, or even a specific number of hours of recordings, needed to generate 1,000 steps in this software.
The reason I want to know that is so I can generate some user interface elements for the user on the front-end of the mimic-recording-studio software I have forked, so they can know when to expect a model to be generated if they run the training process.
E.g. it’s useless to run the training software if you have zero recordings, but it might also be useless to run it if you have 1,000 recordings. Sure, you can still run an analyzer to inspect the quality of what you ran through the processor, but at some point you will be able to train a model in mimic2, and since the default is to export a checkpoint every 1,000 steps, the obvious question is … “How many recordings do I need to meet that minimal threshold?”
If I know the answer to that, I can check that with software. If you do not have X recordings, or X hours of recordings, then you will never even hit the minimal threshold needed to generate the number of steps needed to generate a model / checkpoint.
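To make it concrete, here is a rough sketch of the kind of check I have in mind. Everything in it is hypothetical: `check_training_readiness`, `min_recordings` and `min_hours` are names I made up for illustration, not anything from mimic-recording-studio or mimic2, and the threshold values are exactly the numbers I am asking about, so they are left unset by default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadinessCheck:
    ready: bool
    message: str


def check_training_readiness(
    num_recordings: int,
    hours_recorded: float,
    min_recordings: Optional[int] = None,  # unknown "X recordings" (placeholder)
    min_hours: Optional[float] = None,     # unknown "X hours" (placeholder)
) -> ReadinessCheck:
    """Decide whether the UI should let the user start training.

    Thresholds left as None are skipped, because the real values
    are not known yet.
    """
    # With zero recordings there is nothing to train on.
    if num_recordings <= 0:
        return ReadinessCheck(False, "No recordings yet; record some phrases before training.")

    # Compare against the minimum number of recordings, if one is configured.
    if min_recordings is not None and num_recordings < min_recordings:
        missing = min_recordings - num_recordings
        return ReadinessCheck(False, f"Record {missing} more phrases before training.")

    # Compare against the minimum recorded duration, if one is configured.
    if min_hours is not None and hours_recorded < min_hours:
        missing_hours = min_hours - hours_recorded
        return ReadinessCheck(False, f"Record {missing_hours:.1f} more hours before training.")

    return ReadinessCheck(True, "Enough recordings to start training.")
```

The front-end would call this with the current recording count and total duration, then show `message` next to the "train" button. Once someone can tell me realistic values for X, they would go into `min_recordings` or `min_hours`.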
And I know you can change the number of steps needed to generate a model, but it still does not answer the question of understanding the ratio of number of recordings to number of steps …
So, if someone can clarify that for me, it would mean a LOT. If it can’t be answered because there is no way to know, that’s fine too. It just seemed like it must be answerable by someone with experience who could go, “oh, ya, to generate at least 1,000 steps in the training process, you would need about X hours of recordings” … and that’s what I want to know, since the answer to my original question was that a step is not a recording. I am just looking to understand the relationship between a recording and a step.
Sorry for the long post, ha ha