Remove constant columns from an RDD and compute the covariance matrix

My RDD might have columns with a constant value; in other words, the variance of some columns may be zero. My objective is to remove all such columns from the RDD and ultimately compute the covariance matrix of the remaining columns. How can I do that?

Thanks and regards,

learning_spark asked Oct 15 '25 08:10


1 Answer

An RDD is immutable, so you cannot remove anything from it in place. Instead, map it to a new RDD whose rows have the shape you want (here, rows without the constant columns), and/or use filter to drop elements you don't need. The documentation has more details.
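As a rough illustration of the map-based approach, here is a minimal sketch, not a definitive implementation. It assumes the RDD holds equal-length rows of plain numbers (lists or tuples) with at least two rows, and it relies only on the RDD methods `map`, `reduce` and `count`. The function names `constant_column_mask`, `drop_constant_columns` and `covariance_matrix` are hypothetical, made up for this example. A column is treated as constant when its minimum equals its maximum, which is equivalent to zero variance.

```python
# Sketch only: `rdd` is assumed to hold equal-length rows of numbers.
# Only the RDD methods map, reduce and count are used.

def constant_column_mask(rdd):
    """Return a list of booleans, True where a column holds a single value."""
    def merge(a, b):
        # Merge two (mins, maxs) pairs column by column.
        mins = [min(x, y) for x, y in zip(a[0], b[0])]
        maxs = [max(x, y) for x, y in zip(a[1], b[1])]
        return (mins, maxs)

    # Each row starts out as its own (mins, maxs) pair.
    mins, maxs = rdd.map(lambda row: (list(row), list(row))).reduce(merge)
    return [lo == hi for lo, hi in zip(mins, maxs)]


def drop_constant_columns(rdd):
    """Map every row to a new row without the constant columns.

    Returns the new RDD and the indices of the columns that were kept.
    """
    mask = constant_column_mask(rdd)
    keep = [i for i, is_const in enumerate(mask) if not is_const]
    return rdd.map(lambda row: [row[i] for i in keep]), keep


def covariance_matrix(rdd):
    """Sample covariance (divides by n - 1), as a list of lists."""
    n = rdd.count()

    def stats(row):
        # Per-row contribution: the row itself and its outer product.
        return (list(row), [[x * y for y in row] for x in row])

    def merge(a, b):
        sums = [x + y for x, y in zip(a[0], b[0])]
        prods = [[x + y for x, y in zip(ra, rb)]
                 for ra, rb in zip(a[1], b[1])]
        return (sums, prods)

    sums, prods = rdd.map(stats).reduce(merge)
    d = len(sums)
    # cov[i][j] = (sum(x_i * x_j) - sum(x_i) * sum(x_j) / n) / (n - 1)
    return [[(prods[i][j] - sums[i] * sums[j] / n) / (n - 1)
             for j in range(d)] for i in range(d)]
```

With PySpark, the input could come from something like `sc.parallelize(rows)`, followed by `reduced, keep = drop_constant_columns(rdd)` and `covariance_matrix(reduced)`. The sum-of-products formula is simple but can lose precision when values are large relative to their spread. If MLlib is available, `Statistics.colStats(rdd).variance()` and `RowMatrix.computeCovariance()` are likely more robust alternatives; check that your Spark version exposes them.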

Costi Ciudatu answered Oct 16 '25 22:10