I'm looking for an API to convert speech into text on iOS, mainly for numbers and letters like 1, 2, 3, 4 and a, b, c, d.
I've tried OpenEars as many people suggested, but it appears to only support certain words, such as "GO FORWARD BACKWARD LEFT RIGHT START STOP TURN". Can it be used to recognize generic words or spoken numbers?
I have also tried the iSpeech API, but when I speak a string of numbers like 12345, it returns only the text "one two three four five". It also gives me a single recognition result instead of a list of guesses (like the Google voice recognition API on Android).
How can I use either of these APIs (or another alternative) to recognize spoken numbers or letters?
To learn how to create custom language models and how to dynamically create language models with OpenEars (a language model is your custom set of words), read the OpenEars docs here:
http://www.politepix.com/openears/yourapp
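As a rough illustration of what "your custom set of words" could look like for this question, here is a minimal Swift sketch that only builds the word list for digits and letters. The function name `makeDigitAndLetterVocabulary` is hypothetical, and the uppercase spelling is an assumption based on the uppercase words quoted in the question. How the array is passed to OpenEars is not shown; follow the docs linked above for the actual language model generation call.

```swift
import Foundation

// Hypothetical helper (not part of OpenEars): builds the list of words
// that a digits-and-letters language model would need to contain.
func makeDigitAndLetterVocabulary() -> [String] {
    // Spoken forms of the digits 0 through 9.
    // Uppercase is an assumption, matching the words quoted in the question.
    let digits = ["ZERO", "ONE", "TWO", "THREE", "FOUR",
                  "FIVE", "SIX", "SEVEN", "EIGHT", "NINE"]

    // One entry per letter, "A" through "Z".
    let letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".map { String($0) }

    return digits + letters
}

let vocabulary = makeDigitAndLetterVocabulary()
print(vocabulary.count)
```

Keeping the vocabulary this small is the point of a custom language model: the recognizer only has to choose among the words you actually expect.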
To learn how to use an acoustic model with OpenEars which is oriented towards recognizing spoken digits, read this discussion in the OpenEars forum:
http://www.politepix.com/forums/topic/way-to-see-phonemes-openears-heard
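Separately from the acoustic model, a result such as "one two three four five" (as described in the question) can be turned back into 12345 with a small post-processing step. The sketch below is plain Swift and does not use any OpenEars or iSpeech API; `digitWords` and `digitString(fromHypothesis:)` are hypothetical names, and it assumes the recognizer returns space-separated English digit words.

```swift
import Foundation

// Hypothetical lookup table from spoken digit words to digit characters.
let digitWords: [String: Character] = [
    "ZERO": "0", "ONE": "1", "TWO": "2", "THREE": "3", "FOUR": "4",
    "FIVE": "5", "SIX": "6", "SEVEN": "7", "EIGHT": "8", "NINE": "9"
]

// Hypothetical helper: converts a recognized phrase of digit words into a
// digit string. Returns nil if the phrase is empty or contains any word
// that is not a digit word.
func digitString(fromHypothesis hypothesis: String) -> String? {
    // Assumption: words are separated by single or repeated spaces.
    let tokens = hypothesis.uppercased().split(separator: " ")
    guard !tokens.isEmpty else { return nil }

    var result = ""
    for token in tokens {
        guard let digit = digitWords[String(token)] else { return nil }
        result.append(digit)
    }
    return result
}

if let digits = digitString(fromHypothesis: "one two three four five") {
    print(digits)
}
```

Returning nil for anything that is not purely digit words lets the caller decide whether to show the raw recognition text instead.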
You can also look at the code in the OpenEars sample app, which is heavily commented and shows an example of changing the app's "vocabulary" inline. If you have more questions about implementing OpenEars, I recommend making an account on the OpenEars forums (I'm the OpenEars developer).