Setting up MaryTTS (with an open Server) in another language

Looping · May 21, 2020, 6:38pm

Hey guys,

Since I reached my 10,000-character limit with IBM’s Watson, I want to change my TTS to MaryTTS. I do not want to install my own server but use an open one.

I got far enough that Mycroft speaks to me in German, but it pronounces the German words with a strong English accent. I set it up like this:

{
  "max_allowed_core_version": 20.2,
  "listener": {
    "device_name": "USB Camera-B4.09.24.1: Audio",
    "wake_word": "hey jarvis"
  },
  "hotwords": {
    "hey jarvis": {
      "module": "pocketsphinx",
      "phonemes": "HH EY . JH AA R V AH S",
      "lang": "en-us",
      "threshold": 1e-25
    }
  },
  "lang": "de-de",
  "tts": {
    "marytts": {
      "url": "http://marytts.chrobi.me:59125",
      "lang": "de-de",
      "voice": "dfki-pavoque-styles"
    },
    "module": "marytts"
  },
  "confirm_listening": true
}

So where is the mistake? I could not find any instructions on how to use MaryTTS with an RPi4, so I just did what I thought might make sense. Apparently not quite?

Best regards

Looping

PeterPan · November 20, 2021, 12:35pm

Kind of late to your problem, but I just had the same problem and found a solution.

I am running my own MaryTTS inside a Docker container, so I could see the logs.
There you can see that “dfki-pavoque-styles” is not an acceptable choice for voice output.
It depends on which voices are installed in your MaryTTS. You should check what your MaryTTS provider offers.
Anyway, dfki-pavoque-styles is wrong/missing, so I changed it to “dfki-pavoque-neutral-hsmm”.

The second problem in your config is the “lang” value.
You should change it from “de-de” to “de”.

For me it's working now.
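The two changes above can be sketched in Python as a small patch to the config. This is only an illustration: `fix_marytts_settings` is a made-up helper name, the dict is a trimmed copy of the config from the first post, and whether “dfki-pavoque-neutral-hsmm” is actually installed on the server you use is an assumption you have to check.

```python
import json


def fix_marytts_settings(config: dict, voice: str, lang: str = "de") -> dict:
    """Return a copy of a Mycroft config with the MaryTTS lang/voice replaced.

    Hypothetical helper for illustration; it is not part of Mycroft.
    """
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    tts = updated.setdefault("tts", {})
    tts["module"] = "marytts"
    mary = tts.setdefault("marytts", {})
    mary["lang"] = lang    # short locale ("de") instead of "de-de"
    mary["voice"] = voice  # must be a voice the server really offers
    return updated


# Trimmed copy of the config from the question.
original = {
    "lang": "de-de",
    "tts": {
        "module": "marytts",
        "marytts": {
            "url": "http://marytts.chrobi.me:59125",
            "lang": "de-de",
            "voice": "dfki-pavoque-styles",
        },
    },
}

fixed = fix_marytts_settings(original, voice="dfki-pavoque-neutral-hsmm")
print(json.dumps(fixed["tts"], indent=2))
```

Only the two values inside the “marytts” block change; the top-level “lang” stays at “de-de”.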

If you don't have your own MaryTTS server running and can't see the logs, you can get a hint about why your configuration is wrong, or what it should look like, as follows:

Visit http://mary.dfki.de:59125/
Open the developer tools (if you don't know how, google how to open them for your browser; it's very easy).
Go to the network tab.
Now put some text into the MaryTTS interface and click the “SPEAK” button.
In the network tab you will now see a new request.
Click on it.
On the right side you can now see what a correct request to the server should look like and which parameters the page is sending.
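If you would rather script this than click through the browser, a rough sketch could look like the following. Assumptions on my side: the server accepts a request with only the handful of parameters shown here (the browser sends many more `effect_*` ones), and it exposes a `/voices` listing with one voice per line in the form `name locale gender type`. The function names are made up for this example, and `fetch_voices` is only defined, not called, because it needs network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_process_url(base_url, text, voice, locale):
    """Build a /process URL with a reduced parameter set (assumed to suffice)."""
    params = {
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "LOCALE": locale,
        "VOICE": voice,
        "INPUT_TEXT": text,
    }
    return base_url.rstrip("/") + "/process?" + urlencode(params)


def parse_voice_listing(listing):
    """Turn lines like 'upmc-pierre-hsmm fr male hmm' into (voice, locale) pairs."""
    voices = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            voices.append((parts[0], parts[1]))
    return voices


def fetch_voices(base_url, timeout=10):
    """Ask the server which voices it offers (assumes a /voices endpoint)."""
    with urlopen(base_url.rstrip("/") + "/voices", timeout=timeout) as response:
        return parse_voice_listing(response.read().decode("utf-8"))


url = build_process_url(
    "http://mary.dfki.de:59125", "Guten Tag", "dfki-pavoque-neutral-hsmm", "de"
)
print(url)
```

Calling `fetch_voices("http://mary.dfki.de:59125")` should then tell you which voice names and locales are valid for that server, provided the endpoint behaves as assumed.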

For example:
GET
http://mary.dfki.de:59125/process?INPUT_TYPE=TEXT&OUTPUT_TYPE=AUDIO&INPUT_TEXT=Bienvenue dans le monde de la synthèse de la parole! &OUTPUT_TEXT=&effect_Volume_selected=&effect_Volume_parameters=amount:2.0;&effect_Volume_default=Default&effect_Volume_help=Help&effect_TractScaler_selected=&effect_TractScaler_parameters=amount:1.5;&effect_TractScaler_default=Default&effect_TractScaler_help=Help&effect_F0Scale_selected=&effect_F0Scale_parameters=f0Scale:2.0;&effect_F0Scale_default=Default&effect_F0Scale_help=Help&effect_F0Add_selected=&effect_F0Add_parameters=f0Add:50.0;&effect_F0Add_default=Default&effect_F0Add_help=Help&effect_Rate_selected=&effect_Rate_parameters=durScale:1.5;&effect_Rate_default=Default&effect_Rate_help=Help&effect_Robot_selected=&effect_Robot_parameters=amount:100.0;&effect_Robot_default=Default&effect_Robot_help=Help&effect_Whisper_selected=&effect_Whisper_parameters=amount:100.0;&effect_Whisper_default=Default&effect_Whisper_help=Help&effect_Stadium_selected=&effect_Stadium_parameters=amount:100.0&effect_Stadium_default=Default&effect_Stadium_help=Help&effect_Chorus_selected=&effect_Chorus_parameters=delay1:466;amp1:0.54;delay2:600;amp2:-0.10;delay3:250;amp3:0.30&effect_Chorus_default=Default&effect_Chorus_help=Help&effect_FIRFilter_selected=&effect_FIRFilter_parameters=type:3;fc1:500.0;fc2:2000.0&effect_FIRFilter_default=Default&effect_FIRFilter_help=Help&effect_JetPilot_selected=&effect_JetPilot_parameters=&effect_JetPilot_default=Default&effect_JetPilot_help=Help&HELP_TEXT=&exampleTexts=&VOICE_SELECTIONS=upmc-pierre-hsmm fr male hmm&AUDIO_OUT=WAVE_FILE&LOCALE=fr&VOICE=upmc-pierre-hsmm&AUDIO=WAVE_FILE

Every key/value pair is separated by an &.
For example, these two key/value pairs:
VOICE=upmc-pierre-hsmm&AUDIO=WAVE_FILE

are VOICE=upmc-pierre-hsmm and AUDIO=WAVE_FILE.
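Splitting such a URL by hand gets tedious, so here is a small sketch that does it with Python's standard library. The URL is a shortened version of the request above (most `effect_*` parameters left out). As far as I can tell, LOCALE and VOICE are the values that correspond to “lang” and “voice” in the Mycroft config, but treat that mapping as an assumption.

```python
from urllib.parse import parse_qs, urlsplit

# Shortened version of the captured request; most effect_* parameters omitted.
url = (
    "http://mary.dfki.de:59125/process?INPUT_TYPE=TEXT&OUTPUT_TYPE=AUDIO"
    "&effect_Volume_selected=&LOCALE=fr&VOICE=upmc-pierre-hsmm&AUDIO=WAVE_FILE"
)

# parse_qs splits on "&" and "="; keep_blank_values keeps empty ones like
# effect_Volume_selected= instead of silently dropping them.
params = parse_qs(urlsplit(url).query, keep_blank_values=True)

# Each key appears once here, so take the first value of every list.
settings = {key: values[0] for key, values in params.items()}

for key in ("LOCALE", "VOICE", "AUDIO"):
    print(key, "=", settings[key])
```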

Good luck!