Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Kalman filters with four input parameters

I have been studying the operation of the Kalman filter for a couple of days now to improve the performance of my face detection program. From the information I have gathered I have put together a code. The code for the Kalman filter part is as follows.

int Kalman(int X,int faceWidth,int Y,int faceHeight, IplImage *img1){
CvRandState rng; 
const float T = 0.1;

// Initialize Kalman filter object, window, number generator, etc
cvRandInit( &rng, 0, 1, -1, CV_RAND_UNI );

//IplImage* img = cvCreateImage( cvSize(500,500), 8, 3 );
CvKalman* kalman = cvCreateKalman( 4, 4, 0  );

// Initializing with random guesses
// state x_k
CvMat* state = cvCreateMat( 4, 1, CV_32FC1 );
cvRandSetRange( &rng, 0, 0.1, 0 );
rng.disttype = CV_RAND_NORMAL;
cvRand( &rng, state );

// Process noise w_k
CvMat* process_noise = cvCreateMat( 4, 1, CV_32FC1 );

// Measurement z_k
CvMat* measurement = cvCreateMat( 4, 1, CV_32FC1 );
cvZero(measurement);

/* create matrix data */   
 const float A[] = {    
        1, 0, T, 0,   
        0, 1, 0, T,   
        0, 0, 1, 0,   
        0, 0, 0, 1   
    }; 

 const float H[] = {    
        1, 0, 0, 0,   
        0, 0, 0, 0,   
        0, 0, 1, 0,   
        0, 0, 0, 0   
    };

 //Didn't use this matrix in the end as it gave an error:'ambiguous call to overloaded function' 
/* const float P[] = {   
        pow(320,2), pow(320,2)/T, 0, 0,   
        pow(320,2)/T, pow(320,2)/pow(T,2), 0, 0,   
        0, 0, pow(240,2), pow(240,2)/T,   
        0, 0, pow(240,2)/T, pow(240,2)/pow(T,2)   
        }; */

 const float Q[] = {   
        pow(T,3)/3, pow(T,2)/2, 0, 0,   
        pow(T,2)/2, T, 0, 0,   
        0, 0, pow(T,3)/3, pow(T,2)/2,   
        0, 0, pow(T,2)/2, T   
        };   

 const float R[] = {   
        1, 0, 0, 0,   
        0, 0, 0, 0,   
        0, 0, 1, 0,   
        0, 0, 0, 0   
        };   

//Copy created matrices into kalman structure
memcpy( kalman->transition_matrix->data.fl, A, sizeof(A));   
memcpy( kalman->measurement_matrix->data.fl, H, sizeof(H));   
memcpy( kalman->process_noise_cov->data.fl, Q, sizeof(Q));   
//memcpy( kalman->error_cov_post->data.fl, P, sizeof(P));   
memcpy( kalman->measurement_noise_cov->data.fl, R, sizeof(R)); 

//Initialize other Kalman Filter parameters
//cvSetIdentity( kalman->measurement_matrix, cvRealScalar(1) );
//cvSetIdentity( kalman->process_noise_cov, cvRealScalar(1e-5) );
/*cvSetIdentity( kalman->measurement_noise_cov, cvRealScalar(1e-1) );*/
cvSetIdentity( kalman->error_cov_post, cvRealScalar(1e-5) );

/* choose initial state */  
kalman->state_post->data.fl[0]=X;   
kalman->state_post->data.fl[1]=faceWidth;   
kalman->state_post->data.fl[2]=Y;   
kalman->state_post->data.fl[3]=faceHeight;

//cvRand( &rng, kalman->state_post );

/* predict position of point  */
const CvMat* prediction=cvKalmanPredict(kalman,0);

//generate measurement (z_k)
cvRandSetRange( &rng, 0, sqrt(kalman->measurement_noise_cov->data.fl[0]), 0 );   
cvRand( &rng, measurement ); 
cvMatMulAdd( kalman->measurement_matrix, state, measurement, measurement );

//Draw rectangles in detected face location
cvRectangle( img1,
            cvPoint( kalman->state_post->data.fl[0], kalman->state_post->data.fl[2] ),
            cvPoint( kalman->state_post->data.fl[1], kalman->state_post->data.fl[3] ),
            CV_RGB( 0, 255, 0 ), 1, 8, 0 );

cvRectangle( img1,
            cvPoint( prediction->data.fl[0], prediction->data.fl[2] ),
            cvPoint( prediction->data.fl[1], prediction->data.fl[3] ),
            CV_RGB( 0, 0, 255 ), 1, 8, 0 );

cvShowImage("Kalman",img1);

//adjust kalman filter state
cvKalmanCorrect(kalman,measurement);

cvMatMulAdd(kalman->transition_matrix, state, process_noise, state);

return 0;
}

In the face detection part(not shown), a box is drawn for the face detected. 'X, Y, faceWidth and faceHeight' are coordinates of the box and the width and the height passed into the Kalman filter. 'img1' is the current frame of a video.

Results:

Although I do get two new rectangles from the 'state_post' and 'prediction' data (as seen in the code), none of them seem to be any more stable than the initial box drawn without the Kalman filter.

Here are my questions:

  1. Are the matrices initialized (transition matrix A, measurement matrix H etc.), correct for this four input case? (eg.4*4 matrices for four inputs?)
  2. Can't we set every matrix to be an identity matrix?
  3. Is the method I followed till plotting the rectangles theoretically correct? I followed the examples in this and the book 'Learning OpenCV' which don't use external inputs.

Any help regarding this would be much appreciated!

like image 701
Kavo Avatar asked May 02 '12 08:05

Kavo


1 Answers

H[] should be identity if you measure directly from the image. If you have 4 measurements and you make 0 some values on the diagonal, you are making those expected measurements (x*H) 0, when it is not true. Then the innovation ( z- x*H) on the kalman filter will be high.

R[] should also be diagonal, though the covariance of the error of measurement can be different from one. If you have normalized coordinates ( the width=height=1), R could be something like 0.01. If you are dealing with pixel coordinates, R=diagonal_ones means an error of one pixel, and that's ok. You can try with 2,3,4, etc...

Your transition matrix A[], which is supposed to propagate the state on each frame, looks like a transition matrix for a state composed of x,y, v_x and v_y. You don't mention velocity in your model, you only talk about measurements. Be careful, do not confuse State (describes position of the face) with Measurements (used to update the state). Your state can be position, velocity and acceleration, and your measurements can be n points in the image. Or the x and y position of the face.

Hope this helps.

like image 163
Jav_Rock Avatar answered Nov 18 '22 03:11

Jav_Rock