Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Numerical column in Excel gets converted as logical

When I try to import data into R studio from Excel, the number columns are getting imported as Logical in R.

Is there a way to import these columns as a number into R? The column in Excel has been formatted to number.

I am using read_excel to import the file.

The spreadsheet I am trying to import has 80 columns.

like image 624
Hisho Avatar asked Jun 20 '18 12:06

Hisho


3 Answers

I ran into the exact same problem, and due to confidentiality policies I can not provide the excel file. However, I found the solution in github provided by jennybc in here, in there she posted:

"But I'm guessing that you have lots of blank values at the top of this worksheet. Looks like this column is being guessed as logical, and anything that's neither NA nor zero is becoming TRUE. If my diagnosis is correct, you should either specify the column type you want (probably numeric in this case) or increase guess_max to something higher than the default of 1000."

As my files change a bit in format, I went for the guess_max suggestion, and this line solves the issue for me:

temp.data <- read_xlsx(filepath, sheet = 1, guess_max = 10000)
like image 51
lucia.hd Avatar answered Sep 29 '22 15:09

lucia.hd


use col_types and explicitly specify the column types.

read_excel(path, sheet = 1, col_names = TRUE, col_types = c("text","numeric","date"), na = "", skip = 0)

https://www.rdocumentation.org/packages/readxl/versions/0.1.1/topics/read_excel

like image 31
SatZ Avatar answered Sep 29 '22 14:09

SatZ


As you did not provide a dataset as an example, I came up with the following dataset:

df <- structure(list(`1_a` = c(1212, 1221, 32432), `2_a` = c(121, 123, 3), `3_a` = c(34, 343, 232), 
                 `4_a` = c(65, 23, 123), `5_a` = c(34, 432, 1)), row.names = c(NA, -3L), 
            class = c("tbl_df", "tbl", "data.frame"))

The dataset are all numeric, with column names starting with a number.

Using the following code, I am able to read the excel file while retaining the column names as they are (test.xlsx being an example of the above dataset):

library(readxl)
df <- read_excel("test.xlsx", sheet = 1, col_names = TRUE)
like image 30
DTYK Avatar answered Sep 29 '22 15:09

DTYK