Given two recorded voices in digital format, is there an algorithm to compare the two and return a coefficient of similarity?

Algorithm for voice comparison

2 Answers

I recommend taking a look at the HTK toolkit for speech recognition (http://htk.eng.cam.ac.uk/), especially the part on feature extraction.

Features that I would assume to be good indicators:

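Mel-frequency cepstral coefficients, MFCCs (general timbre)
LPC, linear predictive coding coefficients (for the spectral envelope, i.e. the resonances of the vocal tract)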
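
To make the second feature concrete, here is a minimal sketch of LPC estimation using the autocorrelation method with the Levinson-Durbin recursion. It uses only the Python standard library and is not taken from HTK; the function names `autocorrelation` and `lpc` are made up for this illustration, and a real front end would also window and pre-emphasise each frame.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], where r[k] = sum over n of frame[n] * frame[n + k]."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err) with a[0] == 1.0, so that the prediction is
    x[n] ~= -sum(a[k] * x[n - k] for k in 1..order).
    err is the remaining prediction error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            # Silent or perfectly predictable frame: stop early.
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= (1.0 - k * k)
    return a, err


if __name__ == "__main__":
    # A decaying exponential obeys x[n] = 0.9 * x[n - 1],
    # so a first-order predictor should recover that factor.
    signal = [0.9 ** n for n in range(200)]
    coeffs, residual = lpc(signal, 2)
    print(coeffs, residual)
```

The coefficient vector for each frame can then be used as one feature vector for that frame; the order (2 here) is an arbitrary choice for the demo, and speech work typically uses a higher one.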

answered Sep 28 '22 04:09

Miquel Ramirez

Given your clarification, I think what you are looking for falls under speech recognition algorithms.

Even though you only want a measure of similarity and are not trying to turn speech into text, the concepts are the same, and I would not be surprised if a large part of those algorithms turned out to be useful.

However, you will have to define this coefficient of similarity more formally and precisely to get anywhere.

EDIT: I believe speech recognition algorithms would be useful because they abstract the sound into features and compare those features against known forms. Conceptually, this might not be that different from taking two recordings, abstracting them, and comparing them.

From the Wikipedia article on hidden Markov models (HMM):

"In speech recognition, the hidden Markov model would output a sequence of n-dimensional real-valued vectors (with n being a small integer, such as 10), outputting one of these every 10 milliseconds. The vectors would consist of cepstral coefficients, which are obtained by taking a Fourier transform of a short time window of speech and decorrelating the spectrum using a cosine transform, then taking the first (most significant) coefficients."
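
The steps in the quoted passage can be sketched roughly as follows, using only the Python standard library. This is an illustration under assumptions, not the method of any particular recogniser: the function names are invented here, the DFT is a naive O(N^2) one, the logarithm of the magnitude spectrum is taken before the cosine transform (the usual cepstrum definition, although the quote does not spell it out), and no mel filter bank is applied, so these are plain cepstral coefficients rather than MFCCs.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n (n must be at least 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Naive DFT; returns |X[k]| for k = 0 .. N // 2."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def dct2(values, n_out):
    """First n_out coefficients of the unnormalised DCT-II."""
    m = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / m)
                for i, v in enumerate(values))
            for k in range(n_out)]


def cepstral_coefficients(frame, n_coeffs=10, floor=1e-10):
    """Window -> magnitude spectrum -> log -> DCT -> keep first n_coeffs."""
    windowed = [s * w for s, w in zip(frame, hamming(len(frame)))]
    # The floor avoids log(0) on silent or exactly cancelled bins.
    log_spec = [math.log(max(mag, floor))
                for mag in magnitude_spectrum(windowed)]
    return dct2(log_spec, n_coeffs)


def frames(signal, frame_len, hop):
    """Split a signal into overlapping frames, dropping any partial tail."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


if __name__ == "__main__":
    # Synthetic stand-in for a recording: two sinusoids, 256 samples.
    signal = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(256)]
    features = [cepstral_coefficients(f) for f in frames(signal, 64, 32)]
    print(len(features), len(features[0]))
```

With a hop corresponding to 10 milliseconds of audio, this yields one small real-valued vector per 10 ms, which is the shape of data the quote describes. A side effect of the log step is that overall loudness only shifts coefficient 0, so dropping it makes the features insensitive to recording level.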

So if you run such an algorithm on both recordings, you end up with a sequence of coefficient vectors representing each recording, and it should be far easier to measure and establish similarity between those sequences than between the raw audio.
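
As one possible way to turn two such coefficient sequences into a single number, the sketch below aligns them with dynamic time warping (DTW) and maps the resulting distance into the range (0, 1]. This is only one arbitrary definition of a similarity coefficient, chosen here for illustration; the function names and the `1 / (1 + d)` mapping are assumptions, not something from the question or from any library.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(seq_a, seq_b):
    """DTW alignment cost between two sequences of feature vectors.

    The cost is divided by len(seq_a) + len(seq_b) so that longer
    recordings are not penalised just for being longer.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i and first j vectors.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b
                                 cost[i][j - 1],      # stretch seq_a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m] / (n + m)


def similarity(seq_a, seq_b):
    """Map the DTW distance to (0, 1]; 1.0 means identical after warping."""
    return 1.0 / (1.0 + dtw_distance(seq_a, seq_b))


if __name__ == "__main__":
    a = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    slow_a = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
    other = [[5.0, 5.0], [6.0, 6.0], [7.0, 7.0]]
    print(similarity(a, slow_a), similarity(a, other))
```

Note that a measure like this reacts to what was said as much as to who said it, which is exactly why the coefficient needs a more formal definition before one algorithm can be judged against another.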

But again, this brings you back to the question of defining the 'similarity coefficient', and introducing dogs and horses did not really help.

(Well, it does help a bit, but in terms of evaluating algorithms and choosing one over another, you will have to do better.)

answered Sep 28 '22 06:09

Unreason

Tags:

c

algorithm

signal-processing

voice

ohho
