I am trying to import a large xlsx file into R that has many sheets of data. I was attempting to do this through XLConnect
, but java memory problems (such as those described in this thread have prevented this technique from being successful.)
Instead, I am trying to use the openxlsx
package, which I have read works much faster and avoids Java altogether. But is there a way to use its read.xlsx
function within a loop to read in all of the sheets into separate dataframes? The technique I was using with the other package is no longer valid bc commands like loadWorkbook()
and getSheets()
can no longer be used.
Thank you for your help.
The read. xlsx method is used to read data from an Excel file or Workbook object into an R data. frame object over the specified file path. In this case, the lapply() method takes as input the sheet names and returns the corresponding data frames belonging to each sheet of the workbook.
I think the getSheetNames()
function is the right function to use. It will give you a vector of the worksheet names in a file. Then you can loop over this list to read in a list of data.frames.
read_all_sheets = function(xlsxFile, ...) {
sheet_names = openxlsx::getSheetNames(xlsxFile)
sheet_list = as.list(rep(NA, length(sheet_names)))
names(sheet_list) = sheet_names
for (sn in sheet_names) {
sheet_list[[sn]] = openxlsx::read.xlsx(xlsxFile, sheet=sn, ...)
}
return(sheet_list)
}
read_all_sheets(myxlsxFile)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With