Over the last few weeks I have edited and adapted 60K sentences, mainly from Wikipedia, for mimic2. I have summarized the problems that remain after the current word parser and packed them into a text file, Probleme.txt.
Unfortunately, I don't understand the function well enough yet to solve these things on my own. Besides that, I'm working on getting my generated sentences ready for mimic2. But I can help with my tool, or offer support.
There's lots of random punctuation; I stripped most of that out. Also, the minus/greater-than/less-than signs would need to be transcribed to words if spoken. Same with the degrees-centigrade symbol, or, if it's just read as "three point three C", the degree mark can be dropped. The times need the same treatment.
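The kind of symbol-to-word transcription described above can be sketched with a few substitution rules. This is a minimal, illustrative example (the rules and function name are my own, not part of mimic2's actual parser):

```python
import re

# Hypothetical normalization rules: transcribe symbols a TTS voice
# cannot pronounce into plain words. Order matters if rules overlap.
REPLACEMENTS = [
    (re.compile(r"(?<=\d)\s*°C"), " degrees centigrade"),  # "5 °C" -> "5 degrees centigrade"
    (re.compile(r"(?<=\s)>(?=\s)"), "greater than"),
    (re.compile(r"(?<=\s)<(?=\s)"), "less than"),
    (re.compile(r"(?<!\w)-(?=\d)"), "minus "),             # "-5" -> "minus 5"
]

def normalize(text: str) -> str:
    """Apply each substitution rule in turn."""
    for pattern, replacement in REPLACEMENTS:
        text = pattern.sub(replacement, text)
    return text

print(normalize("It was -5 °C outside."))
# -> It was minus 5 degrees centigrade outside.
```

A real cleaner would need many more rules (times, dates, currency), but the pattern stays the same: every symbol becomes the word a speaker would actually say.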
If it's a spoken word, it needs to be written out as the actual word and nothing else. I think you're going to need to sit down and manually fix a lot of this stuff, unfortunately. My biggest chunk of time training a voice went into correctly tagging all the clips I would use: 10k of them, and they'd already been run through DeepSpeech (or Google Cloud STT), and even after that about half STILL needed some manual editing to be right. By now I have listened to all of the clips I used; there are about 15k more I could potentially do, but I have no actual desire to sit through them at this point.
My point in this thread is not to announce that I have now collected 30k sentences that I will record with the mimic recording tool. mimic will never be able to pronounce abbreviations, so these must be corrected before release. The Mycroft tools include functions that I would like to improve. Because I edited and adapted so many sentences, I got a feeling for what a deep-learning voice can never pronounce.
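Expanding abbreviations before recording or training can be done with a simple lookup table. A minimal sketch, assuming a hand-maintained table (the entries and function name here are illustrative, not from the Mycroft tools):

```python
import re

# Illustrative abbreviation table; a real one would be much longer
# and language-specific (German sources need their own entries).
ABBREVIATIONS = {
    "Dr.": "Doctor",
    "St.": "Saint",
    "etc.": "et cetera",
}

def expand_abbreviations(sentence: str) -> str:
    """Replace each known abbreviation with its spoken form."""
    for abbr, expansion in ABBREVIATIONS.items():
        # Leading word boundary so "St." doesn't match inside another word.
        sentence = re.sub(r"\b" + re.escape(abbr), expansion, sentence)
    return sentence

print(expand_abbreviations("Dr. Smith lives near St. John Street."))
# -> Doctor Smith lives near Saint John Street.
```

The table approach is also easy to audit: anything not in the table stays untouched, so nothing is silently mis-expanded.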
Sorry for my bad Google Translate English.