I have had severe-to-profound deafness from a very early age, but luckily I can speak like a normal person. Verbal communication has always been difficult for me because my speech recognition is impaired, even with lip-reading. I got through school and college just by reading boards, PowerPoint slides, books and the internet. I am doing fine at my current software engineering job, but lately I feel I must put some effort into improving my situation.
Subtitles are my lifesaver in this country for understanding movies and shows on TV, and I have only been able to enjoy them for the last 7 years (I am 31 now).
I strongly feel the need to see subtitles in real life whenever I talk to someone, even strangers. I want to develop an untrained speech-to-text converter, and as a start it does not even have to spell out exact words for me; cues on syllables/phonetics alone would be fine.
I have googled this for a while, but most results are either text-to-speech or half-baked attempts at speech recognition for giving voice commands to a computer. I would really like some pointers on how to start this project. Specifically, I need steps such as how to deal with audio files and what kind of processing I have to do to get approximate phonetics as fast as possible.
    wav') as source:
        audio_text = r.listen(source)
        # recognize_google() will throw a request error if the API is unreachable, hence the exception handling
        try:
            # using Google speech recognition
            text = r.recognize_google(audio_text)
            print('Converting audio transcripts into text ...')
            print(text)
        except:
            print('Sorry..
You might want to look at CMU's Sphinx project, which does speech-to-text in real time. They have some demos you can try.
Have a look at the DSP guide. It's more about low-level stuff, but techniques like Fourier transforms and filtering are of great importance to audio processing. Even if you don't start from scratch, it can be good to appreciate the principles and applications.
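To give a feel for what a Fourier transform does with audio, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the test tone is synthesized rather than recorded, the function names (`sine_wave`, `magnitude_spectrum`, `dominant_frequency`) and the sample rate and frame size are made up for this example, and the transform is the slow textbook version rather than the FFT a real-time tool would need.

```python
import cmath
import math


def sine_wave(freq_hz, sample_rate, n_samples):
    """Synthesize a pure tone (a stand-in for real microphone samples)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def hann(n_samples):
    """Hann window: tapers the frame edges to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def magnitude_spectrum(frame):
    """Naive O(N^2) discrete Fourier transform, magnitudes for bins 0..N/2."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hann(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc))
    return spectrum


def dominant_frequency(frame, sample_rate):
    """Frequency in Hz of the strongest bin, ignoring the DC bin."""
    mags = magnitude_spectrum(frame)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return peak_bin * sample_rate / len(frame)


# Assumed values for illustration only.
SAMPLE_RATE = 8000   # samples per second
FRAME_SIZE = 256     # one short frame, 32 ms at this rate

tone = sine_wave(1000.0, SAMPLE_RATE, FRAME_SIZE)
peak_hz = dominant_frequency(tone, SAMPLE_RATE)
print(f"Strongest frequency in this frame: {peak_hz:.1f} Hz")
```

Speech is handled the same way in principle: cut the signal into short overlapping frames, compute a spectrum for each, and look at how the energy moves across frequencies over time.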
That said, I bet that starting from scratch, one could create something that can tell apart a basic set of sounds with a few days' work...
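As a rough idea of what "telling apart a basic set of sounds" could look like, here is a small sketch that labels a frame as silence, voiced (vowel-like hum) or unvoiced (hiss-like, as in "s" or "f") using two crude features: loudness and zero-crossing rate. This is an assumption-heavy toy, not a recognizer: the function names, the thresholds and the synthetic test signals are all invented for the example, and real speech would need tuning and better features.

```python
import math
import random


def rms_energy(frame):
    """Root-mean-square amplitude: a simple loudness measure."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def classify_frame(frame, silence_rms=0.01, zcr_threshold=0.25):
    """Label one frame; both thresholds are guesses chosen for this demo."""
    if rms_energy(frame) < silence_rms:
        return "silence"
    if zero_crossing_rate(frame) > zcr_threshold:
        return "unvoiced"   # noisy, hiss-like sound
    return "voiced"         # periodic, hum-like sound


# Synthetic stand-ins for real recordings (assumed values).
SAMPLE_RATE = 8000
FRAME_SIZE = 400  # 50 ms

rng = random.Random(0)  # fixed seed so the demo is repeatable
hum = [math.sin(2 * math.pi * 150.0 * n / SAMPLE_RATE) for n in range(FRAME_SIZE)]
hiss = [rng.uniform(-1.0, 1.0) for _ in range(FRAME_SIZE)]
quiet = [0.0] * FRAME_SIZE

for name, frame in (("hum", hum), ("hiss", hiss), ("quiet", quiet)):
    print(name, "->", classify_frame(frame))
```

Even a distinction this coarse, shown frame by frame, is a kind of cue about what is being said, and it can be refined step by step with spectral features.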