 

R - corrplot() correlation matrix has question marks in the grid. How to understand the matrix?

I am trying to create a correlation matrix of the variables in the IMDB movie prediction dataset from Kaggle. When I plot the correlation matrix, some cells of the grid show question marks.

[Image: Correlation matrix]

All the variables are numeric. How should I interpret the question marks?

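library(corrplot)
# Keep only the numeric columns of the data frame
numeric_col <- sapply(df, is.numeric)
movie_numeric <- df[, numeric_col]
# Compute the correlation matrix and plot it
Correlation <- cor(movie_numeric)
corrplot(Correlation)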
user1884763 asked Sep 20 '25 01:09



1 Answer

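As @neilfws said in his comment, NA values in the correlation matrix are represented by question marks. With its default settings, cor() returns NA for any pair of variables in which at least one has a missing value.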
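To check where the NA entries come from, one option is to count the missing values per column and look at the raw correlation matrix. The sketch below uses a small made-up data frame (`toy` is hypothetical and stands in for `movie_numeric`); it only illustrates how a single missing value propagates through `cor()`.

```r
# Toy data frame (hypothetical, not the IMDB data) with one missing value in b
toy <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(2, 4, NA, 8, 10),
  c = c(5, 3, 4, 1, 2)
)

# Count the missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Default use = "everything": correlations between b and the other columns are NA
default_cor <- cor(toy)
print(default_cor)
```

Running the same `colSums(is.na(...))` call on `movie_numeric` should show which columns are responsible for the question marks.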

You can try to avoid having NA values by using only pairwise-complete observations when computing the correlation matrix:

Correlation <- cor(movie_numeric, use="pairwise.complete.obs")
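As a rough illustration of what the `use` argument changes, here is a minimal sketch on a made-up data frame (`toy` is hypothetical, not the IMDB data). It compares `"pairwise.complete.obs"` with `"complete.obs"`, the other common setting, which drops every row that contains any NA.

```r
# Toy data frame (hypothetical) with one missing value in column b
toy <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(2, 4, NA, 8, 10),
  c = c(5, 3, 4, 1, 2)
)

# Each pair of columns uses every row where both values are present
pairwise_cor <- cor(toy, use = "pairwise.complete.obs")
print(pairwise_cor)

# All pairs use only the rows that have no NA in any column
complete_cor <- cor(toy, use = "complete.obs")
print(complete_cor)
```

Pairwise deletion keeps more data, but each cell may be computed from a different set of rows, so the resulting matrix is not guaranteed to be positive semi-definite. If some cells are still NA afterwards, the cause may be something other than missing values, for example a column with zero variance.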
Karolis Koncevičius answered Sep 21 '25 20:09
