Voice Creation Expertise?

Hey all, we’ve been playing around with MaryTTS, and in the past I’ve used pico2wave to handle the text-to-speech (a.k.a. Mycroft talking). There are also remote services we’ve played with, like Ivona. I’m curious whether anyone in the forum has a TTS background. MaryTTS allows you to create new voices, but their existing voices sound quite synthetic. I’d like to create a set that is less synthetic and that we can run locally on the RPi in order to make it fast.

Any advice? Feedback?

I was under the impression that Festival was the best and most flexible one out there, although it’s been a few years since I last dabbled with this stuff. It was certainly more capable than espeak, but then espeak was the no-frills, just-works option. Although, saying that, I did have a lot of fun with espeak by sshing into my girlfriend’s computer from the other room and making it talk to her when she least expected it. Mwahahaha

I have extensive experience using TTS.

Currently, I have…

Neospeech: Paul, Kate, Julie, Bridget

Ivona: Amy, Brian, Emma, Eric, Geraint, Ivy, Jennifer, Joey, Kendra, Kimberly, Russell, Salli

I can vouch that some of the Neospeech voices I have don’t sound so robotic.

Last time I looked at this, the MBROLA voices for Festival were better than the open ones bundled with it, although I’m sure you’ve already looked at that!

All TTS voices are robotic and suck.

A few closed-source voices are somewhat better than others, but honestly, Mycroft is named after a computer, and the relationship to Sherlock’s smarter brother was acknowledged. A somewhat robotic voice is acceptable, and the Festival English RP voices are pretty advanced. If the default were an English RP voice, that could be easily supported and documented, and of course the user could choose any Festival voice, even poor old Tom.