Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

read_excel 'expecting numeric' ..... and value is numeric

Tags:

r

excel

I didn't find an answer to this question, so hopefully this is the place to get some help on this.

I am reading in many Excel files contained in .zip files. Each .zip that I have has about 40 excel files that I want to read. I am trying to create a list of data frames, but encounter an error on reading some files based on file content.

This is the read statment, inside a for loop:

    library(readxl)
    df[[i]] <- read_excel(xls_lst[i],
                      skip = 4,
                      col_names = FALSE,
                      na = "n/a",
                      col_types = data_types)

data_types has these values : > data_types [1] "text" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"

which is correct for this file.

The read_excel statement works well on some files, but returns warning message on others :

In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types,... :
[54, 7]: expecting numeric: got '9999.990000'

Well, the value '9999.99000' looks like a numeric to me. When I open the Excel file that creates this warning, the file indeed shows these values, and also shows that the column is formatted as text in Excel. When I change the column formatting to numeric, re-save the Excel sheet, then the data is read in correctly.

However, I have several hundreds of these files to read ... how can read_excel ignore the column format indicated by Excel, and instead use the col_type defintion that I supply in the calling statement ?

Thanks,

like image 976
bici.sancta Avatar asked Jun 10 '26 23:06

bici.sancta


1 Answers

I tried to build a toy example.

My xlsx file contains:

3   1
3   3
4   4
5   5
7   '999
6   3

Reading in it your way:

data_types<-c("numeric","numeric")
a<-read_excel("aa.xlsx",
       col_names = FALSE,
       na = "n/a",
       col_types = data_types 
       )
Warning message:
In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types,  :
[5, 2]: expecting numeric: got '999'

Reading in everything as text

data_types<-c("text","text")
dat<-read_excel("aa.xlsx",
          col_names = FALSE,
          na = "n/a",
          col_types = data_types
          )

And using type.convert:

dat[]<-lapply(dat, type.convert)

works at least for this simple example.

*Edited:

There was a mistake in the code.

*Edit in response to comment:

Another toy example demonstrating how you could apply type.convert to your data:

#list of data frames
l<-list()
l[[1]]<-data.frame(matrix(rep(as.character(1:5),2), ncol = 2), stringsAsFactors = FALSE)
l<-rep(l,3)

#looping over your list to encode columns correctly:
for (i in 1: length(l)){ 
l[[i]][]<-lapply(l[[i]], type.convert)
}

There might be better solutions. But I think this should work.

like image 141
Alex Avatar answered Jun 13 '26 17:06

Alex