Setting up MaryTTS (with an open server) in another language

Hey guys,

Since I reached my 10,000-character limit on IBM’s Watson, I want to switch my TTS to MaryTTS. I don't want to install my own server but use an open one instead.

I got as far as Mycroft speaking to me in German, but it pronounces the German words with a strong English accent. I set it up like this:

  {
    "max_allowed_core_version": 20.2,
    "listener": {
      "device_name": "USB Camera-B4.09.24.1: Audio",
      "wake_word": "hey jarvis"
    },
    "hotwords": {
      "hey jarvis": {
        "module": "pocketsphinx",
        "phonemes": "HH EY . JH AA R V AH S",
        "lang": "en-us",
        "threshold": 1e-25
      }
    },
    "lang": "de-de",
    "tts": {
      "marytts": {
        "url": "",
        "lang": "de-de",
        "voice": "dfki-pavoque-styles"
      },
      "module": "marytts"
    },
    "confirm_listening": true
  }

So where is the mistake? I could not find any instructions on how to use MaryTTS with an RPi4, so I just did what I thought might make sense. Apparently not quite?

Best regards


Kind of late to your problem, but I just had the same one and found a solution.

I am running my own MaryTTS inside a Docker container, so I could see the logs.
They showed that “dfki-pavoque-styles” is not an acceptable choice for the voice.
Which voices are valid depends on what is installed in your MaryTTS, so check what your MaryTTS provider offers.
Either way, dfki-pavoque-styles was wrong/missing for me, so I changed it to “dfki-pavoque-neutral-hsmm”.
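
If the server exposes the stock MaryTTS HTTP interface, it should also list its installed voices at `/voices`, one per line as name, locale, gender and type (the same shape as the `VOICE_SELECTIONS` value in the example request further down). A rough Python sketch, assuming that endpoint is available on your server; `parse_voices`, `fetch_voices` and `voices_for` are just helper names I made up:

```python
from urllib.request import urlopen


def parse_voices(listing):
    """Turn a MaryTTS-style voice listing into a list of dicts.

    Each non-empty line is assumed to look like: name locale gender type
    """
    voices = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        voices.append({
            "name": parts[0],
            "locale": parts[1],
            "gender": parts[2] if len(parts) > 2 else "",
            "type": parts[3] if len(parts) > 3 else "",
        })
    return voices


def fetch_voices(base_url):
    """Download and parse the voice list (assumes a /voices endpoint)."""
    with urlopen(base_url.rstrip("/") + "/voices", timeout=10) as resp:
        return parse_voices(resp.read().decode("utf-8"))


def voices_for(voices, lang):
    """Names of all voices whose locale starts with the given language code."""
    return [v["name"] for v in voices if v["locale"].startswith(lang)]
```

Something like `voices_for(fetch_voices("http://localhost:59125"), "de")` would then give the German voice names to choose from. The address is only a placeholder for a local server; use the base URL of whatever server you configured.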

The second problem in your config is the “lang” value.
You should change it from “de-de” to “de”.

It's working for me now.
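
To make the two changes concrete, here is a small Python sketch that applies them to the `tts` block from the question. It assumes the `lang` in question is the one inside the `marytts` block (the top-level `lang` is left alone); `fix_marytts_section` is only an illustrative name, the `url` stays whatever your server is, and the voice has to be one your server really offers:

```python
import copy
import json


def fix_marytts_section(config, voice="dfki-pavoque-neutral-hsmm", lang="de"):
    """Return a copy of the config with the MaryTTS voice and lang replaced."""
    fixed = copy.deepcopy(config)
    tts = fixed.setdefault("tts", {})
    tts["module"] = "marytts"
    section = tts.setdefault("marytts", {})
    section["voice"] = voice  # must be a voice installed on the server
    section["lang"] = lang    # short code ("de") instead of "de-de"
    return fixed


# The relevant part of the config as posted (url left empty as in the post).
posted = {
    "lang": "de-de",
    "tts": {
        "marytts": {"url": "", "lang": "de-de", "voice": "dfki-pavoque-styles"},
        "module": "marytts",
    },
}

# Print the patched "tts" block so it can be pasted back into the config.
print(json.dumps(fix_marytts_section(posted)["tts"], indent=2))
```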

If you don't have your own MaryTTS server running and can't see the logs, you can still get a hint about why your configuration is wrong, or what it should look like, with these steps:

  • visit the web interface of the MaryTTS server you want to use
  • open the developer tools (if you don't know how, google how to open them for your browser; it's very easy)
  • go to the network tab
  • put some text into the MaryTTS interface and click the “SPEAK” button
  • a new request will now show up in the network tab
  • click on it
  • on the right side you can now see what a correct request to the server looks like and which parameters it sends

For example:
GET dans le monde de la synthèse de la parole! &OUTPUT_TEXT=&effect_Volume_selected=&effect_Volume_parameters=amount:2.0;&effect_Volume_default=Default&effect_Volume_help=Help&effect_TractScaler_selected=&effect_TractScaler_parameters=amount:1.5;&effect_TractScaler_default=Default&effect_TractScaler_help=Help&effect_F0Scale_selected=&effect_F0Scale_parameters=f0Scale:2.0;&effect_F0Scale_default=Default&effect_F0Scale_help=Help&effect_F0Add_selected=&effect_F0Add_parameters=f0Add:50.0;&effect_F0Add_default=Default&effect_F0Add_help=Help&effect_Rate_selected=&effect_Rate_parameters=durScale:1.5;&effect_Rate_default=Default&effect_Rate_help=Help&effect_Robot_selected=&effect_Robot_parameters=amount:100.0;&effect_Robot_default=Default&effect_Robot_help=Help&effect_Whisper_selected=&effect_Whisper_parameters=amount:100.0;&effect_Whisper_default=Default&effect_Whisper_help=Help&effect_Stadium_selected=&effect_Stadium_parameters=amount:100.0&effect_Stadium_default=Default&effect_Stadium_help=Help&effect_Chorus_selected=&effect_Chorus_parameters=delay1:466;amp1:0.54;delay2:600;amp2:-0.10;delay3:250;amp3:0.30&effect_Chorus_default=Default&effect_Chorus_help=Help&effect_FIRFilter_selected=&effect_FIRFilter_parameters=type:3;fc1:500.0;fc2:2000.0&effect_FIRFilter_default=Default&effect_FIRFilter_help=Help&effect_JetPilot_selected=&effect_JetPilot_parameters=&effect_JetPilot_default=Default&effect_JetPilot_help=Help&HELP_TEXT=&exampleTexts=&VOICE_SELECTIONS=upmc-pierre-hsmm fr male hmm&AUDIO_OUT=WAVE_FILE&LOCALE=fr&VOICE=upmc-pierre-hsmm&AUDIO=WAVE_FILE

Every key/value pair is separated by an &.
For example, these two key/value pairs:

VOICE=upmc-pierre-hsmm and AUDIO=WAVE_FILE
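
If you'd rather not pick the pairs out by eye, you can paste the query string into a few lines of Python. This sketch only uses the tail of the request shown above (the long run of `effect_*` parameters is left out), and the variable names are my own:

```python
from urllib.parse import parse_qsl

# Tail of the captured request above (the effect_* parameters are left out).
captured = (
    "VOICE_SELECTIONS=upmc-pierre-hsmm fr male hmm"
    "&AUDIO_OUT=WAVE_FILE&LOCALE=fr&VOICE=upmc-pierre-hsmm&AUDIO=WAVE_FILE"
)

# Split on "&" and "=", keeping parameters that have an empty value too.
params = dict(parse_qsl(captured, keep_blank_values=True))

# These are the values to compare against "lang" and "voice" in your config.
print(params["LOCALE"], params["VOICE"], params["AUDIO"])
# prints: fr upmc-pierre-hsmm WAVE_FILE
```

The `LOCALE` and `VOICE` values are the ones to compare with `lang` and `voice` in your config. Note that `LOCALE` here is the short form (`fr`), which fits the `de` instead of `de-de` fix above.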

Good luck!
