I have a dataframe and would like to calculate the correlation (with Spearman, data is categorical and ranked) but only for a subset of columns. I tried with all, but R's cor() function only accepts numerical data (x must be numeric, says the error message), even if Spearman is used.
One brute approach is to delete the non-numerical columns from the dataframe. This is not as elegant, for speed I still don't want to calculate correlations between all columns.
I hope there is a way to simply say "calculate correlations for columns x, y, z". Column references could by number or by name. I suppose the flexible way to provide them would be through a vector.
Any suggestions are appreciated.
Correlation is basically a concept used to identify relationship between two numeric variables. It is not applicable for non numeric data. If you want to understand relationship between two non numeric data, you can use chi squared test of independence.
We can use select_if() function to get numeric columns by calling the function with the dataframe name and isnumeric() function that will check for numeric columns.
You can see the correlation between two columns of pandas DataFrame by using DataFrame. corr() function.
In this method to calculate the correlation between two variables, the user has to simply call the corr() function from the base R, passed with the required parameters which will be the name of the variables whose correlation is needed to be calculated and further this will be returning the correlation detail between ...
if you have a dataframe where some columns are numeric and some are other (character or factor) and you only want to do the correlations for the numeric columns, you could do the following:
set.seed(10) x = as.data.frame(matrix(rnorm(100), ncol = 10)) x$L1 = letters[1:10] x$L2 = letters[11:20] cor(x) Error in cor(x) : 'x' must be numeric
but
cor(x[sapply(x, is.numeric)]) V1 V2 V3 V4 V5 V6 V7 V1 1.00000000 0.3025766 -0.22473884 -0.72468776 0.18890578 0.14466161 0.05325308 V2 0.30257657 1.0000000 -0.27871430 -0.29075170 0.16095258 0.10538468 -0.15008158 V3 -0.22473884 -0.2787143 1.00000000 -0.22644156 0.07276013 -0.35725182 -0.05859479 V4 -0.72468776 -0.2907517 -0.22644156 1.00000000 -0.19305921 0.16948333 -0.01025698 V5 0.18890578 0.1609526 0.07276013 -0.19305921 1.00000000 0.07339531 -0.31837954 V6 0.14466161 0.1053847 -0.35725182 0.16948333 0.07339531 1.00000000 0.02514081 V7 0.05325308 -0.1500816 -0.05859479 -0.01025698 -0.31837954 0.02514081 1.00000000 V8 0.44705527 0.1698571 0.39970105 -0.42461411 0.63951574 0.23065830 -0.28967977 V9 0.21006372 -0.4418132 -0.18623823 -0.25272860 0.15921890 0.36182579 -0.18437981 V10 0.02326108 0.4618036 -0.25205899 -0.05117037 0.02408278 0.47630138 -0.38592733 V8 V9 V10 V1 0.447055266 0.210063724 0.02326108 V2 0.169857120 -0.441813231 0.46180357 V3 0.399701054 -0.186238233 -0.25205899 V4 -0.424614107 -0.252728595 -0.05117037 V5 0.639515737 0.159218895 0.02408278 V6 0.230658298 0.361825786 0.47630138 V7 -0.289679766 -0.184379813 -0.38592733 V8 1.000000000 0.001023392 0.11436143 V9 0.001023392 1.000000000 0.15301699 V10 0.114361431 0.153016985 1.00000000
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With