To make this FR a little more universal, I’d suggest adding support for the SSML <audio> tag in Mimic3. This would allow including arbitrary pre-recorded audio within the TTS messages and could be used for prepending “alert” type sound snippets to TTS messages.
I’d love to switch to Mycroft/Mimic3 for my Home Assistant installation, but this is actually the one feature that keeps me from doing it. I currently still use PicoTTS, which allows me to do just that (add Enterprise-like sounds to messages) using SSML, plus some other features (like changing volume and language). The generated audio is then played on a bunch of synchronized “audio alert boxes” (Pi Zeros, each built into a 3W USB speaker module), using Logitech Media Server.
Here is an example of my bilingual (German/English) Home Assistant “say_greeting” script, using a bunch of variables previously set and some randomness. It looks rather complicated, but you’ll get the idea:
say_greeting:
  alias: 'Say greeting'
  description: 'Speak a greeting message via TTS'
  fields:
    entity_id:
      description: 'Entity Id of the media player to use'
      example: 'media_player.signalpi1'
    audiofile:
      description: 'Name of introductory audio file to play. MUST be WAV, pcm-s16le, mono, 16000 Hz.'
      example: 'picotts-beep.wav'
    message:
      description: 'The name of the person to welcome (can be a template)'
      example: "Matthias"
    language:
      description: 'The language to speak in (defaults to tts: setting)'
      example: 'en-GB'
  sequence:
    # only alert if "Audio Alerts" is on
    - condition: state
      entity_id: 'input_boolean.audio_alerts'
      state: 'on'
    # play text message
    - service: tts.picotts_say
      data_template:
        entity_id: "{{ entity_id|default(states('var.audio_alerts_media_player'),true) }}"
        language: "{{ language | default(states('input_select.audio_language'), true) }}"
        message: >
          {% set language = language | default(states('input_select.audio_language'), true) %}
          {% set lang = language[0:2] %}
          {% if audiofile == '' %}
          {% elif audiofile %}
          <play file="{{ states('var.audio_alerts_base_path') }}/{{ lang }}/{{ audiofile }}"/>
          {% else %}
          <play file="{{ states('var.audio_alerts_base_path') }}/{{ lang }}/picotts-{{ ['beep','transporter','door']|random }}.wav"/>
          {% endif %}
          <volume level="60">
          {% if lang == 'de' %}
          {% set greeting = [
            "Willkommen auf der Enterprais, {{ message }}!",
            "Hallo {{ message }}, schön dich zu sehen!",
            "Hai, {{ message }}!",
            "Schön, dass du wieder hier bist, {{ message }}!",
            "Ach, {{ message }}, schon WIEDER nur du!",
            "Wer stört mich jetzt schon wieder? Ach, {{ message }}, du bist's.",
            ] | random | replace('{{ message }}', message) | replace('Matthias',['Käpten','Schäffe','Meister', 'Matze']|random)
          %}
          {{ greeting }}
          {% else %}
          {% set greeting = [
            "Welcome aboard the starship Enterprise, {{ message }}!",
            "Hey {{ message }}, good to see you!",
            "Hi, {{ message }}!",
            "Great to see you are back, {{ message }}!",
            "Aww, {{ message }}, it's only you! AGAIN.",
            "Who is it? Ah, {{ message }}, come in!",
            ] | random | replace('{{ message }}', message) | replace('Matthias',['Captain','Boss','Master','Matt']|random)
          %}
          {{ greeting }}
          {% endif %}
          </volume>
This is actually one of my simpler scripts. I would just love to be able to use Mimic3 for that (using the official <audio> tag instead of PicoTTS’s <play>)!
The above script gets triggered in Home Assistant a few minutes after I arrive back home, and—depending on the language setting—greets me with a random greeting, prepended by a Star Trek TNG “beep”, “transporter”, or “door opening” sound.
Using the <audio> (or, with PicoTTS, <play>) tag has the advantage that the TTS engine will create only one WAV file that an external media player can easily play, including whatever audio you might have included.
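To illustrate what "only one WAV file" means in practice, here is a rough Python sketch of the result such a tag would produce: an intro sound followed by the spoken text, written to a single file. This is not how PicoTTS or Mimic3 do it internally. The function name `prepend_wav` and the file names in the usage comment are made up for this example, and it only uses Python's standard `wave` module.

```python
import wave


def prepend_wav(intro_path, speech_path, out_path):
    """Write the intro sound followed by the speech into one WAV file.

    Hypothetical helper for illustration only. Both inputs must share
    channel count, sample width and sample rate.
    """
    with wave.open(intro_path, "rb") as intro, wave.open(speech_path, "rb") as speech:
        intro_fmt = (intro.getnchannels(), intro.getsampwidth(), intro.getframerate())
        speech_fmt = (speech.getnchannels(), speech.getsampwidth(), speech.getframerate())
        if intro_fmt != speech_fmt:
            raise ValueError(f"format mismatch: {intro_fmt} vs {speech_fmt}")

        with wave.open(out_path, "wb") as out:
            out.setnchannels(intro_fmt[0])
            out.setsampwidth(intro_fmt[1])
            out.setframerate(intro_fmt[2])
            # Raw PCM frames can simply be appended one after the other.
            out.writeframes(intro.readframes(intro.getnframes()))
            out.writeframes(speech.readframes(speech.getnframes()))


# Usage (file names are placeholders):
# prepend_wav("picotts-beep.wav", "greeting.wav", "alert.wav")
```

A media player then only has to fetch and play one file, so nothing has to be queued or synchronized on the player side.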
One might have to store audio in a format compatible with the TTS-generated output (16kHz mono for PicoTTS, probably 22kHz mono for Mimic3), but that shouldn’t be a problem.
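Checking whether a sound snippet matches the engine's output format is easy to script. The sketch below uses Python's standard `wave` module. The helper names `wav_format` and `is_compatible` are made up for this example. The defaults match the PicoTTS requirement from the script above (mono, 16-bit, 16000 Hz), while the 22050 Hz in the usage comment is only my assumption for Mimic3.

```python
import wave


def wav_format(path):
    """Return (channels, bits per sample, sample rate in Hz) of a WAV file."""
    with wave.open(path, "rb") as w:
        return w.getnchannels(), w.getsampwidth() * 8, w.getframerate()


def is_compatible(path, channels=1, bits=16, rate=16000):
    """True if the file matches the target format (defaults: PicoTTS-style)."""
    return wav_format(path) == (channels, bits, rate)


# Usage (file name is a placeholder; 22050 Hz for Mimic3 is an assumption):
# is_compatible("picotts-beep.wav")              # check against 16 kHz mono
# is_compatible("picotts-beep.wav", rate=22050)  # check against 22.05 kHz mono
```

Files that fail the check would need to be resampled once with an audio tool before they are stored.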