After looking at a number of services and tools, I've come to a conclusion: most Text-to-Speech tools have overly techy, robotic voices. In other words, bad quality.
And yeah, on top of that, they seem to come with "hard-coded" voice templates, which limits variety and customization. Some tools let you set the reading speed and pitch, but that's not enough.
My guess about the problem behind the emotional aspect: it's hard to judge emotions from plain text, even more so if it's just a sentence or two. Plus, the good ol' PC is a machine, and machines don't have emotions, but that's a different story.
The thing that bothers me the most is quality. For example, some tools out there tend to cut off the apex of words, which results in these techy voices. It feels like there's a problem with sentence construction or something. And yes, while people are working on such tools, I wonder what keeps them from working a little more to improve that... cutting off the apex is not a small deal! Plus, keep in mind that good, quality Text-to-Speech software is worth, well... A LOT! So it would make a pretty profitable product.
Oh, under fluency I'm including questions, exclamations and so on. (It's possible those don't belong under fluency; I'm not a native English speaker, so please excuse me if that's the case.)
- Loquendo : lacks voice variety, has some minor apex/fluency problems (depending on the sentence), too much coughing and excuses in the examples!
- Nuance Vocalizer : still lacks variety, but some of the provided voices are worthy.
- eSpeak : one of the best robots out there, hence the program logo(?!)
- Natural Reader (dumb autoplay!!) : well, it has some fluency, but that techy feeling still kicks in.
- iSpeech : good for a laugh when setting the voice to Japanese with English text. I bet Japanese guys aren't very happy about it.
- Cepstral + Enhanced Voices ... plus the enhanced voices give the good ol' crappy result, so, except for ~5 more voices, nothing has been enhanced.
- AT&T : decent fluency, but has problems with sentence endings and too much robo!
- LumenVox TTS : seems to come from a background with lots of speech tools, but still results in robotic voices.
- And some more...
In case I've missed something worth a look, please share. It can be free, commercial, super expensive... as long as it works, I'm interested!
And the question(-s)..
I don't know if you're looking for an open solution, but if you have a Mac, you should check out OS X advanced speech markup and the "Repeat After Me" phrase building tool. It's really powerful. The Alex voice built into Mac OS X 10.5 and later is more advanced than the other voices.
On a Mac, highlight the following text, control-click, and go to Speech > Start Speaking:
You talkin' to me
[[inpt PHON]] [[slnc 500]] [[rate -30]]
+yUW _1tAOl=kIHn ~AX [[pbas +3]]+mIY?
http://www.mattmontag.com/personal/mac-os-x-speech-synthesis-markup
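If you'd rather script this than use the context menu, here is a minimal sketch in Python. It assumes the `say` command-line tool that ships with macOS is on your PATH, and that the voice you pick honors embedded commands such as [[slnc]] and [[rate]] (not every voice does). The helper names build_markup and speak are made up for illustration; they are not part of any Apple API.

```python
import shutil
import subprocess


def build_markup(text, rate_delta=None, pause_ms=None):
    """Prefix plain text with embedded speech commands.

    rate_delta: signed change in speaking rate, e.g. -30 -> [[rate -30]]
    pause_ms:   leading silence in milliseconds, e.g. 500 -> [[slnc 500]]
    """
    parts = []
    if pause_ms is not None:
        parts.append(f"[[slnc {pause_ms}]]")
    if rate_delta is not None:
        # The + flag keeps an explicit sign so the change stays relative.
        parts.append(f"[[rate {rate_delta:+d}]]")
    parts.append(text)
    return " ".join(parts)


def speak(markup, voice="Alex"):
    """Speak the markup with the macOS `say` tool.

    Returns False without doing anything if `say` is not installed
    (for example on Linux or Windows).
    """
    if shutil.which("say") is None:
        return False
    subprocess.run(["say", "-v", voice, markup], check=True)
    return True


if __name__ == "__main__":
    line = build_markup("You talkin' to me?", rate_delta=-30, pause_ms=500)
    print(line)
    speak(line)
```

Building the string separately from speaking it makes it easy to print and tweak the markup before you listen to it. The phoneme line from the example above could be passed to speak the same way, as long as it is preceded by [[inpt PHON]].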