I don’t know whether that approach would be as fast as the current sound (the TTS engine is presumably a bit slower than playing a wav file), since the main point is to know immediately whether Mycroft has heard you. It also shouldn’t need to query the internet at all (or should at least have the option not to), so that it sounds as natural as talking to a human.
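For what it’s worth, the latency concern could probably be sidestepped by rendering an acknowledgement phrase to a wav file once, offline, and just playing the cached file at wake time. A minimal sketch, assuming a local TTS CLI such as mimic is installed and takes flite-style `-t`/`-o` flags, and that `paplay` is available for playback (all of that is an assumption, not Mycroft’s actual mechanism):

```python
import os
import subprocess

CACHE_DIR = os.path.expanduser("~/.cache/ack_sounds")

def prerender(phrase, name):
    """Render a phrase to a wav once, offline, so wake-time playback is instant."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, name + ".wav")
    if not os.path.exists(path):
        # mimic is Mycroft's flite-based TTS; flite-style flags assumed here
        subprocess.run(["mimic", "-t", phrase, "-o", path], check=True)
    return path

def play(path):
    # paplay assumed (PulseAudio); swap for aplay/afplay as appropriate
    subprocess.run(["paplay", path], check=True)

if __name__ == "__main__":
    ack = prerender("Yes?", "ack_yes")
    play(ack)  # at wake time: no TTS latency, no network round trip
```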
As things stand, a lot of work appears to be necessary on echo suppression etc.: see my notes from a few days ago (a rough sketch of the kind of filter involved follows after this post).
What /would/ probably be possible is to have context-specific tones rather than different phrases. However, if a phrase is used, I’m seeing a substantial amount of breakthrough between the wakeword acknowledgement and Mycroft’s attempt at interpreting my spoken command, and I must say that it appeared to get worse rather than better over the few days I was tinkering with it: is there any possibility that my experiments have “polluted” a voice recognition corpus somewhere?
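On the echo-suppression point, the usual trick is an adaptive filter that subtracts an estimate of the device’s own output from the mic signal, using the audio being played as a reference. I’m not claiming this is what Mycroft does internally; this is just a rough numpy sketch of an NLMS echo canceller to show the shape of the problem (filter length and step size are made-up values):

```python
import numpy as np

def nlms_echo_cancel(mic, ref, taps=256, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the played-back signal (ref)
    from the mic signal, sample by sample (NLMS update)."""
    w = np.zeros(taps)          # adaptive filter weights
    buf = np.zeros(taps)        # sliding window of recent reference samples
    out = np.zeros_like(mic, dtype=float)
    for n in range(len(mic)):
        buf[1:] = buf[:-1]      # shift the window; newest sample at index 0
        buf[0] = ref[n]
        echo_est = w @ buf      # estimate of the echo reaching the mic
        e = mic[n] - echo_est   # residual: ideally just the user's voice
        w += mu * e * buf / (buf @ buf + eps)  # normalised LMS weight update
        out[n] = e
    return out
```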
If I understood correctly, you’re saying Mycroft could mistake what it’s speaking for my own speech?
That would be a problem if we can’t make it aware of its own voice and able to differentiate it from other voices in the environment.
Can you expand on the context-specific tones approach?
Correct as of the current release. I wouldn’t say that it sits there talking to itself (although I admit that I /have/ tried “Hey Mycroft, say hey Mycroft”), but it’s got progressively worse, to the extent that the only way to get it to recognise anything is to talk loudly while it’s still speaking its acknowledgement. See
It would need coding, but basically just a brief “Bip” or similar as a marker.
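If anyone wants to play with that: a short, distinct tone per context can be generated with nothing but the Python standard library. The frequencies, durations, and the context-to-tone mapping below are invented for illustration, and how you’d actually hook the files into Mycroft would need checking (I believe the listening sound is configurable in mycroft.conf, but treat that as an assumption):

```python
import math
import struct
import wave

RATE = 22050  # sample rate in Hz, arbitrary choice

def write_bip(path, freq_hz, dur_s=0.12, volume=0.4):
    """Write a short sine 'bip' to a 16-bit mono wav file."""
    n = int(RATE * dur_s)
    frames = b"".join(
        struct.pack("<h", int(volume * 32767 *
                              math.sin(2 * math.pi * freq_hz * i / RATE)))
        for i in range(n)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(RATE)
        wav.writeframes(frames)

# One tone per context -- mapping is purely illustrative
write_bip("ack_wake.wav", 880)      # wakeword heard
write_bip("ack_thinking.wav", 660)  # command accepted, processing
write_bip("ack_error.wav", 330)     # didn't catch that
```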