I am developing a website in PHP for illiterate people, to teach a language letter by letter. At the end I'll create an assessment phase in which learners have to speak the displayed letter aloud into a microphone. When the learner pronounces the letter, I want to compare the recording with a saved pronunciation of that letter.
Is it possible to do voice comparison with PHP?
Yes, it is possible. Here are some resources from a little research to get you started. It seems like you have your work cut out for you.
PHP Voice (formerly known as PHP VXML) contains four classes that assist in developing voice applications using PHP. It supports Speech Synthesis Markup Language 1.0, Speech Recognition Grammar Specification 1.0, Voice Browser Call Control: CCXML 1.0, and Voice Extensible Markup Language (VoiceXML) 2.0.
In simple terms, it’s the same old PHP which now enables you to create voice applications.
It’s not an extension to PHP; in fact, it’s the same PHP, which now outputs voice instead of text and also takes voice as input instead of text. In technical terms, it’s PHP whose standard text-based input and output (stdin and stdout, in programmers’ terms) are replaced by voice equivalents.
AQuA is a simple but powerful tool to provide perceptual voice quality testing and audio file comparison in terms of audio quality. This is the easiest way to compare two audio files and test voice quality between original and degraded files.
From Wikipedia: A vocoder is an analysis/synthesis system, used to reproduce human speech. In the encoder, the input is passed through a multiband filter, each band is passed through an envelope follower, and the control signals from the envelope followers are communicated to the decoder. The decoder applies these (amplitude) control signals to corresponding filters in the synthesizer. Since the control signals change only slowly compared to the original speech waveform, the bandwidth required to transmit speech can be reduced. This allows more speech channels to share a radio circuit or submarine cable. By encoding the control signals, voice transmission can be secured against interception.
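The encoder half of that description can be sketched in a few lines. This is only a rough illustration, written in Python for brevity (porting it to PHP is assumed to be straightforward, since it uses nothing beyond loops and basic math). The function names, the band edges, and the 30 Hz follower cutoff are all made up for the example, and the band split here is a crude difference of two one-pole low-pass filters, not the proper band-pass filter bank a real vocoder would use.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Smooth a signal with a first-order (one-pole) low-pass filter."""
    # Smoothing coefficient derived from the cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out = []
    state = 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out


def band_envelope(samples, low_hz, high_hz, sample_rate, follow_hz=30.0):
    """Isolate one band crudely, then follow its amplitude envelope."""
    # Crude band-pass: what passes the upper cutoff minus what passes the lower.
    upper = one_pole_lowpass(samples, high_hz, sample_rate)
    lower = one_pole_lowpass(samples, low_hz, sample_rate)
    band = [u - l for u, l in zip(upper, lower)]
    # Envelope follower: rectify, then smooth with a slow low-pass.
    rectified = [abs(b) for b in band]
    return one_pole_lowpass(rectified, follow_hz, sample_rate)


def encode_envelopes(samples, bands, sample_rate):
    """Return one slowly varying control signal per (low_hz, high_hz) band."""
    return [band_envelope(samples, lo, hi, sample_rate) for lo, hi in bands]
```

Calling `encode_envelopes(samples, [(100, 500), (500, 2000)], 8000)` on a list of float samples returns two envelopes, one per band. Those slowly changing envelopes are the "control signals" the quote refers to, and they are also a plausible starting point for features when comparing two recordings.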
The MASLE project has the goal of creating a series of tools for the evaluation of spoken language over the internet. This evaluation will be performed by automatic speech recognition software as well as by human raters.
NanoGong is an applet that can be used by someone to record, play back and save their voice in a web page. When the recording is played back the user can speed up or slow down the sound without changing it. The applet can be used on a web page or as an integrated component in Moodle.
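None of the resources above show the comparison step itself, so here is a rough idea of what it could look like. This sketch uses dynamic time warping (DTW), a common way to compare two sequences spoken at different speeds; it is my own choice for illustration, not something any of the listed tools is known to use. It is written in Python for brevity, and it assumes each recording has already been turned into a list of per-frame feature vectors (for example, the energy in a few frequency bands). The function names and the threshold value are hypothetical and would need tuning on real recordings.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i frames of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b
                cost[i][j - 1],      # frame of b repeated against a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def is_match(reference, attempt, threshold=0.1):
    """Hypothetical pass/fail check: length-normalised DTW distance under a threshold."""
    if not reference or not attempt:
        return False
    score = dtw_distance(reference, attempt) / (len(reference) + len(attempt))
    return score <= threshold
```

The point of DTW here is that a learner who says the letter more slowly than the saved recording should still match, because stretched frames align at no extra cost. Extracting good features from the microphone audio is the harder part, and this sketch does not cover it.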
It's definitely possible, but there are a lot of things to take into consideration.
This sort of thing is going to have a very long and difficult workflow, with lots of complicated client-side and server-side code. I don't want to be too blunt, but if you need to ask "is it possible?", you probably can't do it. I myself probably wouldn't try something like this without consulting somebody more experienced than me. You need somebody who has had at least a few years of experience with big client-side and server-side systems.
Oh, and this may just be personal preference, but I'd much prefer to be doing something like this using a Java EE server than PHP. I prefer PHP for smaller, easier server-side stuff.