Similarity between two signals: looking for simple measure

I have 20 signals (time-courses) in group A and 20 signals in group B. I want to find a measure that shows group A is different from group B. For example, I ran xcorr for the signals within each group, but now I need to compare the results somehow. I tried taking the maximal amplitude of each xcorr pair, which is a sort of measure of maximal similarity. Then I compared all these values between the two groups, but there was no difference. What else can I do? I could also compare the frequency spectra, but then again I do not know which frequency bin to take. Any suggestions / references are highly appreciated!

I have about 20 signals in each group. Those are my samples. I do not know a priori what the difference might be. Below are 9 sample signals for each group, their auto-correlations, and cross-correlations for a subset of signals (group 1 vs. group 1, group 2 vs. group 2, group 1 vs. group 2). I do not see any evident difference. I also do not understand how you propose to compare cross-correlations: which peaks should I take? All the signals were detrended and z-scored.

[Seven images not preserved: the sample signals for each group, their auto-correlations, and the cross-correlations described above.]

asked Dec 17 '13 21:12 by user1597969


2 Answers

Well, this may be too simplistic an answer and too complex a measure, but maybe it's worth something.

In order to compare signals, we really have to establish some criterion by which we compare them. This could be so many things. If we want signals that look visually similar, we perform time domain analysis. If we are talking about audio signals that sound similar, we care about frequency or time-frequency analysis. If the signals are supposed to represent noise, then signal variance should be a good measure. In general we may want to use a combination of all sorts of measures. We can do this with a weighted index.

First let's establish what we have: there are two sets of signals: set A and set B. We want some measure that shows set A is different from set B. The signals are detrended.

We take signal a in A and signal b in B. The list of things we can compare:

  • Similarity in time domain (static): Multiply in place and sum.

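  • Similarity in time domain (with shift*): Take fft of each signal, multiply one by the complex conjugate of the other, and ifft. (This gives the circular cross-correlation; if both signals are first zero-padded to length 2n-1, it matches matlab's xcorr up to the ordering of the lags. Without the conjugate the result is a convolution, not a correlation.)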

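  • Similarity in frequency domain (static**): Take fft of each signal, multiply the magnitudes, and sum. (The raw fft values are complex, so use their absolute values to get a real-valued measure.)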

  • Similarity in frequency domain (with shift*): Multiply the two signals and take fft. This will show if the signals share similar spectral shapes.

  • Similarity in energy (or power if different lengths): Square the two signals and sum each (and divide by signal length for power). (Since the signals were detrended, this should be signal variance.) Then subtract and take absolute value for a measure of signal variance similarity.
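
The list above can be sketched in Python with NumPy roughly as follows. This is only a minimal sketch, assuming two real, equal-length 1-D signals; the function names are made up for illustration, and the conjugate and the magnitudes are my reading of what makes each measure real-valued and meaningful.

```python
import numpy as np

def time_static(a, b):
    """Zero-lag similarity: element-wise product, summed."""
    return float(np.dot(a, b))

def time_shifted(a, b):
    """Circular cross-correlation for lags 0..n-1 via the FFT.
    The complex conjugate matters: without it the result is a convolution."""
    return np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))).real

def freq_static(a, b):
    """Product of the magnitude spectra, summed."""
    return float(np.sum(np.abs(np.fft.fft(a)) * np.abs(np.fft.fft(b))))

def freq_shifted(a, b):
    """Magnitude of the FFT of the element-wise product."""
    return np.abs(np.fft.fft(a * b))

def energy_difference(a, b):
    """Absolute difference in mean power (0 means equal power)."""
    return float(abs(np.mean(a ** 2) - np.mean(b ** 2)))
```

The two "shifted" functions return a whole vector, so they still need to be reduced to a single number before they can go into an index.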

* (with shift) -- You could choose to sum over the entire correlation vector to measure total general correlation, you could choose to sum only values in the correlation vector that surpass a certain threshold value (as if you expect echoes of one signal in the other), or just take the maximum value from the correlation vector (where its index is the shift in the second signal that results in maximal correlation with the first signal). Also, if the amount of shift that it takes to reach maximal correlation is important (i.e. if signals are similar only if it takes relatively small shift to reach the point of maximal correlation), then you can incorporate a measure of the index displacement.
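
As a rough illustration of those options, here is a small sketch that reduces a correlation vector to a few scalars. It assumes `c` is a circular correlation with lags 0..n-1, so lags past n/2 are read as negative shifts; the function name and the dictionary keys are hypothetical.

```python
import numpy as np

def summarize_correlation(c, threshold):
    """Reduce a correlation vector c (lags 0..n-1) to scalar features."""
    c = np.asarray(c, dtype=float)
    n = len(c)
    best = int(np.argmax(c))
    # Read lags above n/2 as negative shifts (circular wrap-around).
    signed_lag = best if best <= n // 2 else best - n
    return {
        "total": float(c.sum()),                           # overall correlation
        "above_threshold": float(c[c > threshold].sum()),  # only strong lags
        "peak": float(c[best]),                            # maximal correlation
        "peak_lag": signed_lag,                            # shift at the peak
    }
```

If small shifts matter, `peak_lag` could be used to penalize pairs whose peak sits far from zero lag.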

** (frequency domain similarity) -- You may want to mask out the part of the spectrum that you're not concerned with. For instance, if you only care about the higher-frequency structure (fs/4 and up), you could do:

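n = length(a);
mask = zeros(1,n); mask(ceil(n/4)+1 : floor(3*n/4)+1) = 1;   % bins from fs/4 up to fs/2, plus their negative-frequency mirrors
freq_static = mean(abs(fft(a)) .* abs(fft(b)) .* mask);      % abs: the spectra are complex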
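
The same masking idea in Python might look like the sketch below. It assumes real-valued signals (so `np.fft.rfft` is enough and no negative-frequency bins need handling) and a sampling rate `fs`; the function name and its parameters are hypothetical. Unlike the snippet above, it averages only over the bins that are kept.

```python
import numpy as np

def masked_freq_similarity(a, b, fs=1.0, f_min=None):
    """Mean product of the magnitude spectra over frequencies >= f_min."""
    if f_min is None:
        f_min = fs / 4                        # default cutoff: fs/4 and up
    n = len(a)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)    # bin frequencies, 0 .. fs/2
    mask = freqs >= f_min
    spec_a = np.abs(np.fft.rfft(a))
    spec_b = np.abs(np.fft.rfft(b))
    return float(np.mean(spec_a[mask] * spec_b[mask]))
```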

Also, we may want to implement a circular correlation like so:

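function c = circular_xcorr(a,b)
% Circular cross-correlation of two equal-length vectors, lags 0..n-1.
n = length(a);
full = xcorr(a,b);                    % linear cross-correlation, lags -(n-1)..(n-1)
c = full(n:end);                      % lags 0..n-1
c(2:end) = c(2:end) + full(1:n-1);    % fold lag k-n onto lag k
end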
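
For anyone working in Python, the same folding trick can be sketched with `np.correlate`. This assumes real, equal-length inputs and uses the convention c[k] = sum over t of a[(t + k) mod n] * b[t].

```python
import numpy as np

def circular_xcorr(a, b):
    """Circular cross-correlation c[k] = sum_t a[(t + k) % n] * b[t], k = 0..n-1."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)
    # Linear cross-correlation, lags -(n-1)..(n-1), zero lag at index n-1.
    full = np.correlate(a, b, mode="full")
    c = full[n - 1:].copy()        # lags 0..n-1
    c[1:] += full[:n - 1]          # fold lag k-n onto lag k
    return c
```

The result should agree with `ifft(fft(a) * conj(fft(b)))`, which is cheaper for long signals.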

Finally, we choose the characteristics that are important or relevant, and create a weighted index. Example:

n = 100;
a = rand(1,n); b = rand(1,n);
time_corr_thresh = .8 * n; freq_corr_thresh = .6 * n;
time_static = max(a .* b);
time_shifted = circular_xcorr(a,b);    time_shifted = sum(time_shifted(time_shifted > time_corr_thresh));
freq_static = max(abs(fft(a) .* fft(b)));   % abs: the spectra are complex
freq_shifted = abs(fft(a .* b));     freq_shifted = sum(freq_shifted(freq_shifted > freq_corr_thresh));
w1 = 0; w2 = 1; w3 = .7; w4 = 0;   % one weight per measure
index = w1 * time_static + w2 * time_shifted + w3 * freq_static + w4 * freq_shifted;

We compute this index for each pair of signals.
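
One way to organize that per-pair computation for the two groups is sketched below. The `normalized_similarity` function is only a hypothetical stand-in for whatever weighted index you settle on (here: the peak of the circular cross-correlation, scaled so that identical signals give 1), and the group arguments are assumed to be lists of equal-length 1-D arrays.

```python
import itertools
import numpy as np

def normalized_similarity(a, b):
    """Hypothetical stand-in for the weighted index: peak circular
    cross-correlation, scaled so that identical signals give 1."""
    xc = np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))).real
    return float(xc.max() / (np.linalg.norm(a) * np.linalg.norm(b)))

def pairwise_index(group_a, group_b, index_fn=normalized_similarity):
    """Collect the index for every within-group and between-group pair."""
    within_a = [index_fn(x, y) for x, y in itertools.combinations(group_a, 2)]
    within_b = [index_fn(x, y) for x, y in itertools.combinations(group_b, 2)]
    between = [index_fn(x, y) for x, y in itertools.product(group_a, group_b)]
    return np.array(within_a), np.array(within_b), np.array(between)
```

If the groups really differ in the chosen characteristics, the between-group values would be expected to look different from the within-group values.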

I hope that this outline of signal characterization helps. Comment if anything is unclear.

answered Oct 14 '22 01:10 by Brian


With reference to Brian's answer above, I've written a Python function to compute the similarity of two time-series signals:

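import numpy as np

def compute_similarity(ref_rec, input_rec, weightage=(0.33, 0.33, 0.33)):
    """Weighted distance between input_rec and ref_rec (equal-length 1-D arrays).
    Returns 0 for identical signals; larger values mean less similar."""
    ref_rec = np.asarray(ref_rec, dtype=float)
    input_rec = np.asarray(input_rec, dtype=float)

    ## Time domain similarity (zero lag; the default mode='valid' yields one value)
    ref_time = np.correlate(ref_rec, ref_rec)[0]
    inp_time = np.correlate(ref_rec, input_rec)[0]
    diff_time = abs(ref_time - inp_time)

    ## Freq domain similarity (np.correlate conjugates its second argument)
    ref_fft = np.fft.fft(ref_rec)
    ref_freq = np.correlate(ref_fft, ref_fft)[0]
    inp_freq = np.correlate(ref_fft, np.fft.fft(input_rec))[0]
    diff_freq = abs(ref_freq - inp_freq)

    ## Power similarity
    ref_power = np.sum(ref_rec**2)
    inp_power = np.sum(input_rec**2)
    diff_power = abs(ref_power - inp_power)

    return float(weightage[0]*diff_time + weightage[1]*diff_freq + weightage[2]*diff_power)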
answered Oct 14 '22 01:10 by Ivan