read.xls - read in variable-length list of sheets, with their names

Given several .xls files with a varying number of sheets, I am reading them into R using read.xls from the gdata package. I have two related issues (solving the second should solve the first):

  1. It is unknown ahead of time how many sheets each .xls file will have, and in fact this value will vary from one file to the next.
  2. I need to capture the name of each sheet, which is relevant data.

Right now, to resolve (1), I am using try() and iterating over sheet numbers until I hit an error.

How can I grab a list of the names of the sheet so that I can iterate over them?

Ricardo Saporta asked Mar 28 '13 11:03



2 Answers

See the sheetCount and sheetNames functions in gdata (both are documented on the same help page). If xls <- "a.xls", say, then reading all sheets of a spreadsheet into a list, one sheet per component, is just this:

sapply(sheetNames(xls), read.xls, xls = xls, simplify = FALSE)

Note that the components will be named using the names of the sheets. Depending on the content it might make sense to remove simplify = FALSE.
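
To see where the names come from, here is a minimal base-R sketch. It assumes nothing about gdata: `fake_read` is a hypothetical stand-in for read.xls and the sheet names are made up, so the snippet runs without a real workbook.

```r
# Hypothetical stand-in for read.xls(): returns a one-row data frame per sheet.
fake_read <- function(sheet, xls) {
  data.frame(file = xls, sheet = sheet, stringsAsFactors = FALSE)
}

# Made-up sheet names; in real use sheetNames(xls) would supply these.
sheets <- c("Jan", "Feb", "Mar")

# sapply() over a character vector uses its elements as names (USE.NAMES = TRUE).
# simplify = FALSE keeps the result as a list instead of trying to collapse it.
res <- sapply(sheets, fake_read, xls = "a.xls", simplify = FALSE)

names(res)
```

Because xls is passed by name, each sheet name falls into the next free argument of the reader, which is how the one-liner above hands the sheet to read.xls.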

G. Grothendieck answered Sep 29 '22 12:09


For such tasks I use the XLConnect package. Its getSheets() function returns the sheet names as a character vector, and the length of that vector gives the number of sheets.

library(XLConnect)

# Load the workbook
wb <- loadWorkbook("Your_workbook.xls")

# Get the sheet names as a character vector
lp <- getSheets(wb)

# Read each sheet as a separate list element, named after its sheet
dat <- lapply(lp, function(s) readWorksheet(wb, sheet = s))
names(dat) <- lp

UPDATE

As suggested by @Martin Studer, XLConnect functions are already vectorized, so there is no need for lapply(). Instead, pass the vector of sheet names to readWorksheet(), or call getSheets() directly inside it:

dat <- readWorksheet(wb, sheet = getSheets(wb))
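
Since the question treats the sheet name as data, one option is to carry it into each data frame as a column and then stack the sheets. The following is only a sketch: it assumes `dat` is a list of data frames named by sheet with identical columns, and it uses made-up data in place of a real workbook.

```r
# Made-up stand-in for a list of sheets read from a workbook, named by sheet.
dat <- list(
  Jan = data.frame(id = 1:2, value = c(10, 20)),
  Feb = data.frame(id = 3L, value = 30)
)

# Prepend the sheet name as a column to each data frame.
tagged <- Map(
  function(df, nm) cbind(sheet = nm, df, stringsAsFactors = FALSE),
  dat, names(dat)
)

# Stack all sheets into one data frame (requires identical columns).
combined <- do.call(rbind, unname(tagged))
rownames(combined) <- NULL

combined
```

If the sheets have different columns, keeping them as a named list is probably the safer choice.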
Didzis Elferts answered Sep 29 '22 14:09