
correlation error: 'x' must be numeric

I have an xts dataset, called dataset, that contains closing prices for many stocks. I want to find out whether their returns are correlated using cor(), but I get an error message: Error in cor(RETS) : 'x' must be numeric.

Here is what I have done:

RETS <- CalculateReturns(dataset, method= c("log")) # Calculate returns Via PerformanceAnalytics
RETS<- na.locf(RETS) #Solves missing NAs by carrying forward last observation
RETS[is.na(RETS)] <- "0"  #I then fill the rest of the NAs by adding "0"

Here is a sample of RETS

    row.names  A.Close   AA.Close  AADR.Close  AAIT.Close  AAL.Close
1  2013-01-01        0          0           0           0          0
2  2013-01-02   0.0035     0.0088      0.0044    -0.00842          0
3  2013-01-03   0.0195     0.0207   -0.002848    -0.00494          0
4  2013-01-06  -0.0072    -0.0174      0.0078    -0.00070          0
5  2013-01-07  -0.0080          0    -0.01106    -0.03353          0
6  2013-01-08   0.0266  -0.002200    0.006655      0.0160          0
7  2013-01-09   0.0073   -0.01218    0.007551    0.013620          0

Then I perform the correlation:

#Perform Correlation
cor(RETS) -> correl
Error in cor(RETS) : 'x' must be numeric

#Tried using as.numeric
cor(as.numeric(RETS), as.numeric(RETS)) -> correl

However, the result is just "1". I also tried the correlation function in the psych package, but I get the same error message.

Jason asked Jun 06 '14 07:06




1 Answer

I'm adding @Roland's answer here to close out the question.

The problem is that using

RETS[is.na(RETS)] <- "0"

converts all of the data to character, because assigning a character value into a numeric object coerces every element to character. cor() cannot compute a correlation on character values, which is why it reports that 'x' must be numeric. So if you simply do

RETS[is.na(RETS)] <- 0

instead, you should avoid the conversion problem.
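
To see the coercion in isolation, here is a minimal sketch. It assumes a small plain numeric matrix (the hypothetical `rets`, with made-up values) as a stand-in for the data underlying RETS; the replacement step is the same as in the question.

```r
# Hypothetical stand-in for RETS: a small numeric matrix with one missing value
rets <- matrix(c(0.0035, NA, -0.0072, 0.0088, 0.0207, -0.0174),
               ncol = 2, dimnames = list(NULL, c("A.Close", "AA.Close")))

bad <- rets
bad[is.na(bad)] <- "0"    # character replacement coerces every cell to character
is.numeric(bad)           # no longer numeric
typeof(bad)               # storage type is now character

good <- rets
good[is.na(good)] <- 0    # numeric replacement keeps the matrix numeric
is.numeric(good)

cor_good <- cor(good)     # works; cor(bad) would stop with "'x' must be numeric"
```

Checking is.numeric() on the object right before calling cor() is a quick way to confirm whether this coercion has happened.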

Rather than setting your missing values to 0, you might also consider explicitly telling cor() how to handle missing values. For example,

cor(RETS, use="pairwise.complete.obs")

will calculate the correlation between each pair of variables using only the observations where both are non-NA. See the ?cor help page for all of the options.
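
As a rough illustration of what that option does, the sketch below uses a hypothetical two-column matrix `rets` with made-up values and scattered NAs. It compares the default behaviour with pairwise deletion, and reproduces the pairwise result by hand.

```r
# Hypothetical returns with scattered NAs (values made up for illustration)
rets <- cbind(
  A.Close  = c(0.0035, 0.0195, NA, -0.0080, 0.0266, 0.0073),
  AA.Close = c(0.0088, 0.0207, -0.0174, NA, -0.0022, -0.0122)
)

# Default use = "everything": an NA in either column makes that correlation NA
cor_default <- cor(rets)

# Pairwise deletion: each pair uses only rows where both columns are observed
cor_pairwise <- cor(rets, use = "pairwise.complete.obs")

# The same value computed by hand from the rows that are complete in both columns
ok <- complete.cases(rets)
cor_manual <- cor(rets[ok, "A.Close"], rets[ok, "AA.Close"])
```

Unlike filling with 0, this does not treat a missing return as a real observation of zero, so it avoids pulling the estimated correlations toward whatever the filled values imply.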

MrFlick answered Sep 23 '22 17:09
