Hey guys,
I've hit my 10,000-character limit on IBM's Watson, so I want to switch my TTS to MaryTTS. I don't want to install my own server; I'd rather use an open one.
I got as far as having Mycroft speak to me in German, but it pronounces the German words with a strong English accent. I set it up like this:
{
  "max_allowed_core_version": 20.2,
  "listener": {
    "device_name": "USB Camera-B4.09.24.1: Audio",
    "wake_word": "hey jarvis"
  },
  "hotwords": {
    "hey jarvis": {
      "module": "pocketsphinx",
      "phonemes": "HH EY . JH AA R V AH S",
      "lang": "en-us",
      "threshold": 1e-25
    }
  },
  "lang": "de-de",
  "tts": {
    "marytts": {
      "url": "http://marytts.chrobi.me:59125",
      "lang": "de-de",
      "voice": "dfki-pavoque-styles"
    },
    "module": "marytts"
  },
  "confirm_listening": true
}
So where is the mistake? I couldn't find any instructions for using MaryTTS on an RPi4, so I just did what I thought made sense. Apparently not quite?
Best regards
Looping
I'm kind of late to your problem, but I just had the same issue and found a solution.
I am running my own MaryTTS inside a Docker container, so I could see the logs.
They showed that “dfki-pavoque-styles” is not an acceptable choice for voice output.
Which voices are valid depends on which ones are installed in your MaryTTS instance. You should check what your MaryTTS provider offers.
In any case, dfki-pavoque-styles is wrong or missing, so I changed it to “dfki-pavoque-neutral-hsmm”.
The second problem in your config is the “lang” value.
You should change it from “de-de” to “de”.
It's working for me now.
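If you would rather check the voice and language from a script than from the logs, here is a minimal Python sketch. It assumes the server exposes MaryTTS's plain-text /voices listing (one voice per line: name, locale, gender, type), which may not hold for every hosted instance. The function names and the sample listing are made up for illustration.

```python
import urllib.request


def parse_voices(listing: str) -> list:
    """Parse a /voices listing: one voice per line, 'name locale gender [type]'."""
    voices = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        voices.append({"name": parts[0], "locale": parts[1], "gender": parts[2]})
    return voices


def fetch_voices(base_url: str, timeout: float = 10.0) -> list:
    """Download and parse the voice list (assumes a /voices endpoint exists)."""
    url = base_url.rstrip("/") + "/voices"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_voices(resp.read().decode("utf-8"))


def check_tts_config(marytts_cfg: dict, voices: list) -> list:
    """Return human-readable problems with the 'marytts' config section."""
    problems = []
    names = sorted(v["name"] for v in voices)
    locales = sorted({v["locale"] for v in voices})
    if marytts_cfg.get("voice") not in names:
        problems.append(f"voice {marytts_cfg.get('voice')!r} not offered; available: {names}")
    if marytts_cfg.get("lang") not in locales:
        problems.append(f"lang {marytts_cfg.get('lang')!r} not offered; available: {locales}")
    return problems


if __name__ == "__main__":
    # Illustrative listing only; a real server returns whatever is installed.
    sample = "dfki-pavoque-neutral-hsmm de male hmm\ncmu-slt-hsmm en_US female hmm\n"
    cfg = {"lang": "de-de", "voice": "dfki-pavoque-styles"}
    for problem in check_tts_config(cfg, parse_voices(sample)):
        print(problem)
```

Against a live server you would call fetch_voices with the "url" value from your config instead of using the sample listing.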
If you don't have your own MaryTTS server running and can't see the logs, you can still get a hint about why your configuration is wrong, or what it should look like, by:
- visiting http://mary.dfki.de:59125/
- opening the developer tools (if you don't know how, search for how to open them in your browser; it's very easy)
- going to the network tab
- entering some text into the MaryTTS interface and clicking the “SPEAK” button
- looking in the network tab, where a new request will now appear
- clicking on it
- looking at the right-hand side, where you can see what a correct request to the server looks like and which parameters it sends
For example:
GET
http://mary.dfki.de:59125/process?INPUT_TYPE=TEXT&OUTPUT_TYPE=AUDIO&INPUT_TEXT=Bienvenue dans le monde de la synthèse de la parole! &OUTPUT_TEXT=&effect_Volume_selected=&effect_Volume_parameters=amount:2.0;&effect_Volume_default=Default&effect_Volume_help=Help&effect_TractScaler_selected=&effect_TractScaler_parameters=amount:1.5;&effect_TractScaler_default=Default&effect_TractScaler_help=Help&effect_F0Scale_selected=&effect_F0Scale_parameters=f0Scale:2.0;&effect_F0Scale_default=Default&effect_F0Scale_help=Help&effect_F0Add_selected=&effect_F0Add_parameters=f0Add:50.0;&effect_F0Add_default=Default&effect_F0Add_help=Help&effect_Rate_selected=&effect_Rate_parameters=durScale:1.5;&effect_Rate_default=Default&effect_Rate_help=Help&effect_Robot_selected=&effect_Robot_parameters=amount:100.0;&effect_Robot_default=Default&effect_Robot_help=Help&effect_Whisper_selected=&effect_Whisper_parameters=amount:100.0;&effect_Whisper_default=Default&effect_Whisper_help=Help&effect_Stadium_selected=&effect_Stadium_parameters=amount:100.0&effect_Stadium_default=Default&effect_Stadium_help=Help&effect_Chorus_selected=&effect_Chorus_parameters=delay1:466;amp1:0.54;delay2:600;amp2:-0.10;delay3:250;amp3:0.30&effect_Chorus_default=Default&effect_Chorus_help=Help&effect_FIRFilter_selected=&effect_FIRFilter_parameters=type:3;fc1:500.0;fc2:2000.0&effect_FIRFilter_default=Default&effect_FIRFilter_help=Help&effect_JetPilot_selected=&effect_JetPilot_parameters=&effect_JetPilot_default=Default&effect_JetPilot_help=Help&HELP_TEXT=&exampleTexts=&VOICE_SELECTIONS=upmc-pierre-hsmm fr male hmm&AUDIO_OUT=WAVE_FILE&LOCALE=fr&VOICE=upmc-pierre-hsmm&AUDIO=WAVE_FILE
Every key=value pair is separated by an &.
For example, these two key=value pairs:
VOICE=upmc-pierre-hsmm&AUDIO=WAVE_FILE
are VOICE=upmc-pierre-hsmm and AUDIO=WAVE_FILE.
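To pull the interesting parameters out of such a long URL without reading it by eye, something like the following Python sketch could help. It uses only the standard library. The helper names are hypothetical, and the parameter set in build_process_url is my guess at a minimal request based on the captured one above, not a documented minimum.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def query_params(url: str) -> dict:
    """Split the query string on '&' and '=' and return a key -> value dict."""
    parsed = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return {key: values[0] for key, values in parsed.items()}


def build_process_url(base_url: str, text: str, locale: str, voice: str) -> str:
    """Build a /process request URL from the parameters seen in the browser."""
    params = {
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "INPUT_TEXT": text,
        "LOCALE": locale,
        "VOICE": voice,
    }
    return base_url.rstrip("/") + "/process?" + urlencode(params)


if __name__ == "__main__":
    captured = (
        "http://mary.dfki.de:59125/process?INPUT_TYPE=TEXT&OUTPUT_TYPE=AUDIO"
        "&LOCALE=fr&VOICE=upmc-pierre-hsmm&AUDIO=WAVE_FILE"
    )
    params = query_params(captured)
    print(params["LOCALE"], params["VOICE"])

    # Hypothetical local server; substitute your own base URL.
    print(build_process_url("http://localhost:59125", "Guten Tag", "de",
                            "dfki-pavoque-neutral-hsmm"))
```

The LOCALE and VOICE values you find this way are the ones to compare against “lang” and “voice” in your Mycroft config.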
Good luck!