Linear regression in kdb

This is how I compute a linear regression, but my code stops working when I include more than two columns.

// Load relevant columns into memory
//
t:?[`data;enlist(=;`date;dat);0b;(ind,dep)!ind,dep];


X:9h$enlist[count[t]#1],t ind;
beta:inv[X mmu flip X] mmu X mmu 9h$t dep;

res:(`yInt,ind)!beta;

ind holds the names of the independent fields as symbols, and dep holds the name of the dependent field as a symbol. The part inv[X mmu flip X] does not work: if I include more than two fields in ind, a row comes out blank and I cannot apply inv. I am not sure how to include more columns/fields as independent variables.

asked Mar 24 '26 15:03 by Terry



1 Answer

With a minor modification, your code worked for me:

  / Generate some random data
  n:10;t:([]x1:n?1f;x2:n?1f;x3:n?1f)
  / Compute y = 1 + 2*x1 + 3*x2 + 4*x3 + small random noise 
  update y:1+(2*x1)+(3*x2)+(4*x3)+n?1e-3 from `t;
  / Define ind and dep
  ind:`x1`x2`x3
  dep:enlist`y
  / Build the design matrix
  X:enlist[count[t]#1f],t ind
  / Compute the regression parameters
  inv[X mmu flip X] mmu X mmu flip t dep
1.00052
1.99973
2.999979
4.000045

The only potentially tricky part is remembering to enlist dep and to transpose t dep, so that the dependent variable enters the product as an n-by-1 column matrix.
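To make the shapes concrete, here is a minimal sketch that rebuilds the same kind of random data, records the dimensions of the design matrix and of the transposed dependent variable, and then repeats the fit with the built-in lsq keyword. The names shapeX, shapeY, Y, betaInv and betaLsq are introduced here for illustration only. Note that lsq takes the dependent variable as a 1-by-n matrix, so no flip is needed in that form.

```q
/ Hypothetical sample data, mirroring the setup above
n:10
t:([]x1:n?1f;x2:n?1f;x3:n?1f)
t:update y:1+(2*x1)+(3*x2)+(4*x3)+n?1e-3 from t
ind:`x1`x2`x3
dep:enlist`y

/ Design matrix: a row of ones for the intercept, then one row per regressor
X:enlist[count[t]#1f],t ind
shapeX:(count X;count first X)      / (1 + number of regressors; n)

/ Dependent variable as an n-by-1 column matrix
Y:flip t dep
shapeY:(count Y;count first Y)      / (n; 1)

/ Normal equations, flattened to a plain vector of coefficients
betaInv:raze inv[X mmu flip X] mmu X mmu Y

/ Same fit via lsq: finds w such that (t dep) is approximately w mmu X
betaLsq:first (t dep) lsq X
```

Both betaInv and betaLsq should agree up to floating-point error, with the intercept first and the coefficients in the order given by ind.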

From your description, it looks like there may be a problem with your data: either it contains NaNs (nulls), or your ind variables are not linearly independent.
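Both conditions can be checked before calling inv. The sketch below uses deliberately flawed, made-up data (x3 is an exact multiple of x1, and x2 has one null) and the illustrative names nulls, clean and c. It counts nulls per independent column, then builds a pairwise correlation matrix on the null-free rows. This is only a partial check: pairwise correlation will flag two columns that are multiples of each other, but it will not catch a column that is a combination of several others.

```q
/ Hypothetical data: x3 is an exact multiple of x1, and x2 has one null
n:10
t:([]x1:n?1f;x2:n?1f)
t:update x3:2*x1 from t
t:update x2:0n from t where i=0
ind:`x1`x2`x3

/ Number of nulls in each independent column
nulls:ind!sum each null t ind

/ Keep only rows where no independent column is null
clean:t where not any null t ind

/ Pairwise correlation matrix; off-diagonal entries near 1 or -1
/ point to columns that are (almost) linearly dependent
c:(clean ind) cor/:\: clean ind
```

Here nulls reports one null in x2, and the entry of c for the pair x1, x3 is 1 up to floating-point error, which is the kind of dependence that makes X mmu flip X singular.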

answered Mar 26 '26 14:03 by Alexander Belopolsky
