I am trying to run the Dialog demo of the Sphinx4 pre-alpha, but it gives errors.
I am creating a live speech application.
I imported the project using Maven and followed this guide on Stack Overflow: https://stackoverflow.com/a/25963020/2653162
The error mentions issues with the 16 kHz sample rate and the channel being mono, so it is clearly about the sampling settings, and it also mentions the microphone.
I looked into how to change the microphone settings to 16 kHz and 16-bit, but there is no such option in Windows 7.
The thing is that the HelloWorld and Dialog demos worked fine in Sphinx4 1.06 beta, but after I tried the latest release it gives the following errors:
Exception in thread "main" java.lang.IllegalStateException: javax.sound.sampled.LineUnavailableException: line with format PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian not supported.
    at edu.cmu.sphinx.api.Microphone.<init>(Microphone.java:38)
    at edu.cmu.sphinx.api.SpeechSourceProvider.getMicrophone(SpeechSourceProvider.java:18)
    at edu.cmu.sphinx.api.LiveSpeechRecognizer.<init>(LiveSpeechRecognizer.java:34)
    at edu.cmu.sphinx.demo.dialog.Dialog.main(Dialog.java:145)
Caused by: javax.sound.sampled.LineUnavailableException: line with format PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, little-endian not supported.
    at com.sun.media.sound.DirectAudioDevice$DirectDL.implOpen(DirectAudioDevice.java:513)
    at com.sun.media.sound.AbstractDataLine.open(AbstractDataLine.java:121)
    at com.sun.media.sound.AbstractDataLine.open(AbstractDataLine.java:413)
    at edu.cmu.sphinx.api.Microphone.<init>(Microphone.java:36)
    ... 3 more
I can't figure out what to do to resolve the issue.
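For what it's worth, here is a small diagnostic I ran to see which capture formats the Java sound stack will actually accept (this uses only the standard javax.sound.sampled API, not sphinx4; the class name is just for illustration):

    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.DataLine;
    import javax.sound.sampled.Mixer;
    import javax.sound.sampled.TargetDataLine;

    public class FormatCheck {
        public static void main(String[] args) {
            // The exact format from the exception message:
            // PCM_SIGNED 16000.0 Hz, 16 bit, mono, little-endian
            AudioFormat format = new AudioFormat(16000f, 16, 1, true, false);
            DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);

            System.out.println("Default mixer supports 16 kHz mono capture: "
                    + AudioSystem.isLineSupported(info));

            // List every mixer and whether it can open a capture line in this format
            for (Mixer.Info mixerInfo : AudioSystem.getMixerInfo()) {
                Mixer mixer = AudioSystem.getMixer(mixerInfo);
                System.out.println(mixerInfo.getName() + ": " + mixer.isLineSupported(info));
            }
        }
    }

On my machine every mixer reports false for this format, which matches the LineUnavailableException above.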
If you modify SpeechSourceProvider to return a constant microphone reference, it won't try to create multiple microphone references, which is the source of the issue.
    public class SpeechSourceProvider {
        private static final Microphone mic = new Microphone(16000, 16, true, false);

        Microphone getMicrophone() {
            return mic;
        }
    }
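To see what this change buys you, here is a minimal stand-alone sketch of the same pattern (using a plain Object as a stand-in for Microphone, since opening a real capture line needs audio hardware): every caller now gets the identical instance, so only one capture line is ever opened.

    public class SingleInstanceDemo {
        // Stand-in for the shared Microphone: created once, handed to every caller
        private static final Object MIC = new Object();

        static Object getMicrophone() {
            return MIC;
        }

        public static void main(String[] args) {
            // Two "recognizers" asking for a microphone receive the same reference
            System.out.println(getMicrophone() == getMicrophone()); // prints true
        }
    }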
The problem here is that you don't want multiple threads trying to access a single resource, but for the demo, the recognizers are stopped and started as needed so that they aren't all competing for the microphone.
As Nickolay explains on the SourceForge forum (here), the microphone resource needs to be released by the recognizer currently using it before another recognizer can use the microphone. While the API is being fixed, I made the following changes to certain classes in the Sphinx API as a temporary workaround. This is probably not the best solution, but until a better one is proposed, it will work.
I created a class named MicrophoneExtention with the same source code as the Microphone class, and added the following method:

    public void closeLine() {
        line.close();
    }
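For context, this is roughly what the whole MicrophoneExtention class ends up looking like. This is a hedged sketch modeled on sphinx4's edu.cmu.sphinx.api.Microphone (field and method names may differ slightly in your version), with the added closeLine() at the end:

    import java.io.InputStream;

    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.LineUnavailableException;
    import javax.sound.sampled.TargetDataLine;

    public class MicrophoneExtention {

        private final TargetDataLine line;
        private final InputStream inputStream;

        public MicrophoneExtention(float sampleRate, int sampleSize,
                                   boolean signed, boolean bigEndian) {
            // Capture is always mono (1 channel), as sphinx4 expects
            AudioFormat format =
                    new AudioFormat(sampleRate, sampleSize, 1, signed, bigEndian);
            TargetDataLine audioLine;
            try {
                audioLine = AudioSystem.getTargetDataLine(format);
                audioLine.open();
            } catch (LineUnavailableException e) {
                throw new IllegalStateException(e);
            }
            line = audioLine;
            inputStream = new AudioInputStream(line);
        }

        public void startRecording() {
            line.start();
        }

        public void stopRecording() {
            line.stop();
        }

        public InputStream getStream() {
            return inputStream;
        }

        // Added method: fully release the line so another recognizer can open it
        public void closeLine() {
            line.close();
        }
    }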
Similarly, I created a LiveSpeechRecognizerExtention class with the source code of the LiveSpeechRecognizer class, and made the following changes:

    private final MicrophoneExtention microphone;

    microphone = new MicrophoneExtention(16000, 16, true, false);

    public void closeRecognitionLine() {
        microphone.closeLine();
    }
Finally, I edited the main method of the DialogDemo:
    Configuration configuration = new Configuration();

    configuration.setAcousticModelPath(ACOUSTIC_MODEL);
    configuration.setDictionaryPath(DICTIONARY_PATH);
    configuration.setGrammarPath(GRAMMAR_PATH);
    configuration.setUseGrammar(true);
    configuration.setGrammarName("dialog");

    LiveSpeechRecognizerExtention recognizer =
            new LiveSpeechRecognizerExtention(configuration);

    recognizer.startRecognition(true);
    while (true) {
        System.out.println("Choose menu item:");
        System.out.println("Example: go to the bank account");
        System.out.println("Example: exit the program");
        System.out.println("Example: weather forecast");
        System.out.println("Example: digits\n");

        String utterance = recognizer.getResult().getHypothesis();

        if (utterance.startsWith("exit"))
            break;

        if (utterance.equals("digits")) {
            recognizer.stopRecognition();
            recognizer.closeRecognitionLine();   // release the microphone line
            configuration.setGrammarName("digits.grxml");
            recognizer = new LiveSpeechRecognizerExtention(configuration);
            recognizeDigits(recognizer);
            recognizer.closeRecognitionLine();
            configuration.setGrammarName("dialog");
            recognizer = new LiveSpeechRecognizerExtention(configuration);
            recognizer.startRecognition(true);
        }

        if (utterance.equals("bank account")) {
            recognizer.stopRecognition();
            recognizerBankAccount(recognizer);
            recognizer.startRecognition(true);
        }

        if (utterance.endsWith("weather forecast")) {
            recognizer.stopRecognition();
            recognizer.closeRecognitionLine();
            configuration.setUseGrammar(false);
            configuration.setLanguageModelPath(LANGUAGE_MODEL);
            recognizer = new LiveSpeechRecognizerExtention(configuration);
            recognizeWeather(recognizer);
            recognizer.closeRecognitionLine();
            configuration.setUseGrammar(true);
            configuration.setGrammarName("dialog");
            recognizer = new LiveSpeechRecognizerExtention(configuration);
            recognizer.startRecognition(true);
        }
    }
    recognizer.stopRecognition();
And obviously the method signatures in the DialogDemo need changing accordingly.
Hope this helps.
On a final note, I'm not sure if what I did is exactly legitimate to begin with. If I'm doing something wrong, please be kind enough to point out my mistakes :D