
ICA - Statistical Independence & Eigenvalues of Covariance Matrix

I am currently creating different signals using Matlab, mixing them by multiplying them by a mixing matrix A, and then trying to get back the original signals using FastICA.

So far, the recovered signals are really bad when compared to the original ones, which was not what I expected.

I'm trying to see whether I'm doing anything wrong. The signals I'm generating are the following: (Amplitudes are in the range [0,1].)

s1 = (-x.^2 + 100*x + 500) / 3000; % quadratic
s2 = exp(-x / 10); % -ve exponential
s3 = (sin(x)+ 1) * 0.5; % sine
s4 = 0.5 + 0.1 * randn(size(x, 2), 1); % gaussian
s5 = (sawtooth(x, 0.75)+ 1) * 0.5; % sawtooth

[Figure: Original Signals]

One condition for ICA to be successful is that at most one signal is Gaussian, and I've respected this in my signal generation.

However, another condition is that all signals are statistically independent.

All I know is that this means that, given two signals A & B, knowing one signal does not give any information with regards to the other, i.e.: P(A|B) = P(A) where P is the probability.

Now my question is this: Are my signals statistically independent? Is there any way I can determine this? Perhaps some property that must be observed?

Another thing I've noticed is that when I calculate the eigenvalues of the covariance matrix (calculated for the matrix containing the mixed signals), the eigenspectrum seems to show that there is only one (main) principal component. What does this really mean? Shouldn't there be 5, since I have 5 (supposedly) independent signals?

For example, when using the following mixing matrix:

A =

0.2000    0.4267    0.2133    0.1067    0.0533
0.2909    0.2000    0.2909    0.1455    0.0727
0.1333    0.2667    0.2000    0.2667    0.1333
0.0727    0.1455    0.2909    0.2000    0.2909
0.0533    0.1067    0.2133    0.4267    0.2000

The eigenvalues are: 0.0000 0.0005 0.0022 0.0042 0.0345 (only 4 are non-zero!)

When using the identity matrix as the mixing matrix (i.e. the mixed signals are the same as the original ones), the eigenspectrum is: 0.0103 0.0199 0.0330 0.0811 0.1762. There is still one value much larger than the rest.

Thank you for your help.

I apologise if the answers to my questions are painfully obvious, but I'm really new to statistics, ICA and Matlab. Thanks again.

EDIT - I have 500 samples of each signal, in the range [0.2, 100], in steps of 0.2, i.e. x = 0.2:0.2:100.

EDIT - Given the ICA model X = As + n (I'm not adding any noise at the moment), the eigenspectrum I am referring to is that of the covariance of the transpose of X, i.e. eig(cov(X')).

asked Nov 13 '22 09:11 by Rachel


1 Answer

Your signals are correlated (and therefore not independent). Right off the bat, the sawtooth and the sine have the same period. Tell me the value of one and I can tell you a great deal about the value of the other: that is strong dependence.

If you change the period of one of them, that will make them more independent.

Also, s1 and s2 are somewhat correlated.
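One rough way to check this is to look at the pairwise correlation coefficients of the generated signals. The sketch below assumes the sample grid x = 0.2:0.2:100 from the question's edit, builds s4 as a row vector so the signals can be stacked, and uses a hypothetical helper `saw` as a stand-in for `sawtooth(x, 0.75)` so that it runs without the Signal Processing Toolbox. Keep in mind that a correlation near zero does not prove independence; a large one does rule it out.

```matlab
% Sample grid: 500 points, as stated in the question's edit (assumption)
x = 0.2:0.2:100;

% Hypothetical stand-in for sawtooth(t, w): rises from -1 to 1 over the
% first fraction w of each 2*pi period, then falls back to -1
saw = @(t, w) (mod(t, 2*pi) <  2*pi*w) .* (2*mod(t, 2*pi) / (2*pi*w) - 1) + ...
              (mod(t, 2*pi) >= 2*pi*w) .* (1 - 2*(mod(t, 2*pi) - 2*pi*w) / (2*pi*(1 - w)));

s1 = (-x.^2 + 100*x + 500) / 3000;   % quadratic
s2 = exp(-x / 10);                   % negative exponential
s3 = (sin(x) + 1) * 0.5;             % sine
s4 = 0.5 + 0.1 * randn(size(x));     % gaussian (row vector, to match the others)
s5 = (saw(x, 0.75) + 1) * 0.5;       % sawtooth

S = [s1; s2; s3; s4; s5];            % one signal per row
R = corrcoef(S');                    % corrcoef expects one variable per column
disp(R)                              % off-diagonal entries far from 0 indicate dependence

% Same check after changing the sine's frequency (sqrt(2) is an arbitrary choice)
s3b = (sin(sqrt(2) * x) + 1) * 0.5;
Rb  = corrcoef(s3b', s5');
disp(Rb(1, 2))
```

The entries R(3,5) and R(1,2) are the ones to look at for the sine/sawtooth and quadratic/exponential pairs; Rb(1,2) shows how the sine/sawtooth correlation changes once the two periods differ.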

As for the eigenvalues: first, your signals are not independent (see above).

Second, your mixing matrix A is not well conditioned, which spreads the eigenvalues out further.

Even if you were to feed in five fully independent (iid, zero-mean, unit-variance) signals y, the covariance of the mixture would be:

E[ A y y' A' ] = A E[ y y' ] A' = A I A' = A A'

The eigenvalues of that are:

eig(A*A')
ans =

   0.000167972216475
   0.025688510850262
   0.035666735304024
   0.148813869149738
   1.042451912479502

The largest eigenvalue is about 7 times the next one and over 6000 times the smallest. So you're really squashing all the signals down onto essentially one basis function / degree of freedom, and of course they'll be hard to recover, whatever method you use.
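A minimal sketch of how to check this numerically, using the mixing matrix from the question. The synthetic sources here are an assumption made only for illustration: iid uniform samples scaled to zero mean and unit variance, so that E[y y'] = I holds. The sample count N is arbitrary.

```matlab
A = [0.2000 0.4267 0.2133 0.1067 0.0533
     0.2909 0.2000 0.2909 0.1455 0.0727
     0.1333 0.2667 0.2000 0.2667 0.1333
     0.0727 0.1455 0.2909 0.2000 0.2909
     0.0533 0.1067 0.2133 0.4267 0.2000];

lam   = sort(eig(A * A'));    % eigenspectrum expected for ideal sources
kappa = cond(A);              % ratio of largest to smallest singular value of A

% Synthetic sources (assumption): iid uniform, zero mean, unit variance
N = 100000;
Y = sqrt(12) * (rand(5, N) - 0.5);

C   = cov((A * Y)');          % cov expects one observation per row
err = norm(C - A * A', 'fro');% should shrink as N grows

disp(lam')
disp(kappa)
disp(err)
```

A large `kappa` means some directions in source space are almost wiped out by the mixing, which matches the single dominant eigenvalue. A mixing matrix with a condition number closer to 1 should give a flatter eigenspectrum for the same sources.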

answered Jan 08 '23 18:01 by Nate