
How do I use a lexicon with SpeechSynthesizer?

I'm doing some text-to-speech and I'd like to specify some special pronunciations in a lexicon file. I ran MSDN's AddLexicon example verbatim. It speaks the sentence, but it does not use the given lexicon, so something appears to be broken.

Here's the provided example:

using System;
using Microsoft.Speech.Synthesis;

namespace SampleSynthesis
{
  class Program
  {
    static void Main(string[] args)
    {

      // Initialize a new instance of the SpeechSynthesizer.
      using (SpeechSynthesizer synth = new SpeechSynthesizer())
      {

        // Configure the audio output. 
        synth.SetOutputToDefaultAudioDevice();

        PromptBuilder builder = new PromptBuilder();
        builder.AppendText("Gimme the whatchamacallit.");

        // Append the lexicon file.
        synth.AddLexicon(new Uri("c:\\test\\whatchamacallit.pls"), "application/pls+xml");

        // Speak the prompt and play back the output file.
        synth.Speak(builder);
      }

      Console.WriteLine();
      Console.WriteLine("Press any key to exit...");
      Console.ReadKey();
    }
  }
}

And the lexicon file:

<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="x-microsoft-ups" xml:lang="en-US">


  <lexeme>
    <grapheme> whatchamacallit </grapheme>
    <phoneme> W S1 AX T CH AX M AX K S2 AA L IH T </phoneme>
  </lexeme>

</lexicon>

The console opens, the text is spoken, but the new pronunciation isn't used. I have of course saved the file to c:\test\whatchamacallit.pls as specified.

I've tried variations of the Uri and file location (e.g. @"C:\Temp\whatchamacallit.pls", @"file:///c:\test\whatchamacallit.pls"), absolute and relative paths, copying it into the build folder, etc.

I ran Process Monitor and the file is never accessed. If it were a directory or file permission problem (which it isn't), I would still see access-denied entries, but I see no reference to the file at all except the occasional one from my text editor. I do see the file accessed when I try File.OpenRead.

Unfortunately there are no error messages when using a garbage Uri.

On further investigation I realized this example is from Microsoft.Speech.Synthesis, whereas I'm using System.Speech.Synthesis. However, from what I can tell, the two are identical except for some additional info and examples, and both point to the same specification. Could this still be the problem?

I verified the project is set to use the proper .NET Framework 4.

I compared the example from MSDN to the examples in the referenced spec, and also tried the spec's examples directly, but it hasn't helped. Considering the file doesn't seem to be accessed, I'm not surprised.

(I am able to use PromptBuilder.AppendTextWithPronunciation just fine but it's a poor alternative for my use case.)

Is the example on MSDN broken? How do I use a lexicon with SpeechSynthesizer?

Christopher Galpin asked Jul 17 '12 19:07




1 Answer

After a lot of research and several pitfalls, I can assure you that your assumption that the two namespaces behave identically is wrong. For some reason System.Speech.Synthesis.SpeechSynthesizer.AddLexicon() adds the lexicon to an internal list but never uses it. It seems nobody tried using it before, so this bug went unnoticed.

Microsoft.Speech.Synthesis.SpeechSynthesizer.AddLexicon() (which belongs to the Microsoft Speech SDK) on the other hand works as expected (it passes the lexicon on to the COM object which interprets it as advertised).

Please refer to this guide on how to install the SDK: http://msdn.microsoft.com/en-us/library/hh362873%28v=office.14%29.aspx

Notes:

  • People have reported that the 64-bit version causes COM exceptions (because the library does not get installed correctly). I confirmed this on a 64-bit Windows 7 machine.
    • Using the x86 version circumvents the problem.
  • Be sure to install the runtime before the SDK.
  • Be sure to also install a runtime language (as advised on the linked page), as the SDK does not use the default system speech engine.
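
Putting the notes above together, here is a minimal sketch of the question's program built against Microsoft.Speech.Synthesis instead of System.Speech.Synthesis. It assumes the project references the Microsoft.Speech assembly from the SDK, that at least one runtime language is installed, and that the lexicon is saved at the path from the question. The file check and the voice listing are additions of mine to make the two silent failure modes visible (a bad path, and no installed Speech Platform voice); they are not part of the MSDN example. If you installed the x86 runtime and SDK, the project would most likely also need to target x86 rather than Any CPU.

```csharp
using System;
using System.IO;
using Microsoft.Speech.Synthesis; // Speech Platform SDK, not System.Speech

namespace SampleSynthesis
{
  class Program
  {
    static void Main(string[] args)
    {
      // Path taken from the question; adjust to wherever the .pls file is saved.
      string lexiconPath = @"c:\test\whatchamacallit.pls";

      // The question reports no error for a garbage Uri, so check the file up front.
      if (!File.Exists(lexiconPath))
      {
        Console.WriteLine("Lexicon file not found: " + lexiconPath);
        return;
      }

      using (SpeechSynthesizer synth = new SpeechSynthesizer())
      {
        // The SDK does not use the default system voices, so list the voices
        // provided by the installed runtime languages.
        var voices = synth.GetInstalledVoices();
        if (voices.Count == 0)
        {
          Console.WriteLine("No voices found; install a runtime language first.");
          return;
        }
        foreach (InstalledVoice voice in voices)
        {
          Console.WriteLine(voice.VoiceInfo.Name + " (" + voice.VoiceInfo.Culture + ")");
        }

        // Configure the audio output.
        synth.SetOutputToDefaultAudioDevice();

        // Register the lexicon before speaking.
        synth.AddLexicon(new Uri(lexiconPath), "application/pls+xml");

        PromptBuilder builder = new PromptBuilder();
        builder.AppendText("Gimme the whatchamacallit.");

        // Speak the prompt on the default audio device.
        synth.Speak(builder);
      }
    }
  }
}
```

If the lexicon is picked up, Process Monitor should now show the .pls file being read when AddLexicon or Speak runs.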
M.Stramm answered Oct 22 '22 08:10