Use R and Openxlsx to output a list of dataframes as worksheets in a single Excel file

Tags:

I have a set of CSV files. I want to package them up and export the data to a single Excel file that contains multiple worksheets. I read in the CSV files as a set of data frames.

My problem is how to construct the command in openxlsx, I can do it manually, but I am having a list construction issue. Specifically how to add a data frame as a subcomponent of a named list and then pass as a parameter to write.xlsx()

Example

Ok, so I first list the CSV files on disk and generate a set of data frames in memory...

# Generate a list of csv files on disk and shorten names... 
filePath <- "../02benchmark/results/results_20170330/"
filePattern <- "*.csv"
fileListwithPath = list.files(path = filePath, pattern = filePattern, full.names = TRUE)
fileList = list.files(path = filePath, pattern = filePattern, full.names = FALSE)

datasets <- gsub("*.csv$", "", fileList)
datasets <- gsub("sample_", "S", datasets)
datasets

# Now generate the dataframes for each csv file...
list2env(
  lapply(setNames(fileListwithPath, make.names(datasets)),
         read.csv), envir = .GlobalEnv)

Example Output:

dput(datasets)
c("S10000_R3.3.2_201703301839", "S10000_T4.3.0_201703301843", 
"S20000_R3.3.2_201703301826", "S20000_T4.3.0_201703301832", "S280000_R3.3.2_201704020847", 
"S280000_T4.3.0_201704021100", "S290000_R3.3.2_201704020447", 
"S290000_T4.3.0_201704020702", "S30000_R3.3.2_201703301803", 
"S30000_T4.3.0_201703301817", "S310000_R3.3.2_201704012331", 
"S310000_T4.3.0_201704020242", "S320000_R3.3.2_201704011827", 
"S320000_T4.3.0_201704012128", "S330000_R3.3.2_201704011304", 
"S330000_T4.3.0_201704011546", "S340000_R3.3.2_201704010652", 
"S340000_T4.3.0_201704011010", "S350000_R3.3.2_201704010020", 
"S350000_T4.3.0_201704010404", "S360000_R3.3.2_201703311819", 
"S360000_T4.3.0_201703312134", "S370000_R3.3.2_201703310914", 
"S370000_T4.3.0_201703311301", "S380000_R3.3.2_201703310134", 
"S380000_T4.3.0_201703310509", "S390000_R3.3.2_201703301846", 
"S390000_T4.3.0_201703302252", "S40000_R3.3.2_201703301738", 
"S40000_T4.3.0_201703301752", "S50000_R3.3.2_201703301707", "S50000_T4.3.0_201703301724", 
"S60000_R3.3.2_201703301624", "S60000_T4.3.0_201703301647", "S70000_R3.3.2_201703301535", 
"S70000_T4.3.0_201703301602", "S80000_R3.3.2_201703301430", "S80000_T4.3.0_201703301508", 
"S90000_R3.3.2_201703301324", "S90000_T4.3.0_201703301400")

Now we have a set of data frames and we wish to create a single excel file with multiple worksheets...

wb <- createWorkbook()
saveWorkbook(wb, 'output.xlsx')

lapply(names(myList), function(x) write.xlsx(myList[[x]], 'output.xlsx', sheetName=x, append=TRUE))

Problem:

The problem is I can create the list structure manually and can confirm it works BUT I cannot seem to construct the list automatically.

myList <- sapply(datasets,function(x) NULL)
names(myList)
str(myList)
myList$S10000_R3.3.2_201703301839 <- eval(S10000_R3.3.2_201703301839)

thus:

> str(myList)
List of 40
 $ S10000_R3.3.2_201703301839 :'data.frame':    43 obs. of  4 variables:
  ..$ function.: Factor w/ 42 levels "DF add random number vector",..: 30 25 38 42 36 39 40 29 26 22 ...
  ..$ user     : num [1:43] 2.144 0.263 0.024 0.068 0.008 ...
  ..$ system   : num [1:43] 0.63 0.065 0.001 0.004 0 ...
  ..$ elapsed  : num [1:43] 12.274 1.104 0.047 0.115 0.009 ...
 $ S10000_T4.3.0_201703301843 : NULL
 $ S20000_R3.3.2_201703301826 : NULL
 ...

Specific Problem: How to append each data frame to the list...

myList <- lapply( myList, function(x) eval(x) )

what am I doing wrong with lapply here? The above lapply() does not iterate through the list and append the data frame to the name list entry.

i.e. myList$S10000_R3.3.2_201703301839 <- eval(S10000_R3.3.2_201703301839)
> str(myList)
    List of 40
     $ S10000_R3.3.2_201703301839 :'data.frame':    43 obs. of  4 variables:
      ..$ function.: Factor w/ 42 levels "DF add random number vector",..: 30 25 38 42 36 39 40 29 26 22 ...
      ..$ user     : num [1:43] 2.144 0.263 0.024 0.068 0.008 ...
      ..$ system   : num [1:43] 0.63 0.065 0.001 0.004 0 ...
      ..$ elapsed  : num [1:43] 12.274 1.104 0.047 0.115 0.009 ...
     $ S10000_T4.3.0_201703301843 : NULL
     $ S20000_R3.3.2_201703301826 : NULL
     ...

What am I missing? All help gratefully appreciated. Yes, I am pretty certain I am missing something obvious... but... I am stumped.

348

asked Apr 03 '17 03:04

Technophobe01

2 Answers

I think it may be worth adding solution using imap function from the purrr package as it offers a convenient mechanism to access name and index of the list element in one call:

imap_xxx(x, ...), an indexed map, is short hand for map2(x, names(x), ...) if x has names, or map2(x, seq_along(x), ...) if it does not. This is useful if you need to compute on both the value and the position of an element.

`imap` solution

On dummy data for reproducibility.

lst_data <- list(cars = mtcars, air = airmiles)
wb <- openxlsx::createWorkbook()
purrr::imap(
    .x = lst_data,
    .f = function(df, object_name) {
        openxlsx::addWorksheet(wb = wb, sheetName = object_name)
        openxlsx::writeData(wb = wb, sheet = object_name, x = df)
    }
)
t_file <- tempfile(pattern = "test_df_export", fileext = ".xlsx")
saveWorkbook(wb = wb, file = t_file)

168

answered Oct 01 '22 14:10

Konrad

Here's the solution with openxlsx:

## create data;
dataframes <- split(iris, iris$Species)

# create workbook
wb <- createWorkbook()
 
#Iterate the same way as PavoDive, slightly different (creating an anonymous function inside Map())
Map(function(data, nameofsheet){     

    addWorksheet(wb, nameofsheet)
    writeData(wb, nameofsheet, data)
 
}, dataframes, names(dataframes))
         
## Save workbook to excel file 
saveWorkbook(wb, file = "file.xlsx", overwrite = TRUE)

.. however, openxlsx is also able to use it's function openxlsx::write.xlsx for this, so you can just give the object with your list of dataframes and the filepath, and openxlsx is smart enough to create the list as sheets within the xlsx-file. The code I post here with Map() is if you want to format the sheets in a specific way.

answered Oct 01 '22 13:10

emilBeBri

Related questions
                            
                                characters converted to dates when using write.csv
                            
                                Data object not found when deploying shiny app
                            
                                Logistic regression when response is a proportion (using JAGS)
                            
                                Fast string matching in R
                            
                                Fastest method to replace data values conditionally in data.table (speed comparison)
                            
                                Download URL links using R
                            
                                Text editor support for rmarkdown documents [closed]
                            
                                Scraping password protected forum in r
                            
                                How can I allow new R users to send information to a Google Form?
                            
                                Ratio of polynomials approximation
                            
                                Reading Rdata file with different encoding
                            
                                Return indices of rows whose elements (columns) all match a reference vector
                            
                                How can I vary opacity in a plotly R chart
                            
                                R corrplot change data labels
                            
                                R: programmatically create a function call
                            
                                Removing elements in a nested R list by name
                            
                                Fast calculation of CDF / rolling join on multiple columns
                            
                                How to shut down an R parallel cluster without the cluster variable?
                            
                                Where to create package environment variables?
                            
                                Setting x-axis limits for datetime in ggplot

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Use R and Openxlsx to output a list of dataframes as worksheets in a single Excel file

Tags:

r

excel

lapply

xlsx

openxlsx

Example

Now we have a set of data frames and we wish to create a single excel file with multiple worksheets...

Problem:

Specific Problem: How to append each data frame to the list...

Technophobe01

People also ask

2 Answers

`imap` solution

Konrad

emilBeBri

Recent Activity

Donate For Us

Use R and Openxlsx to output a list of dataframes as worksheets in a single Excel file

Tags:

r

excel

lapply

xlsx

openxlsx

Example

Now we have a set of data frames and we wish to create a single excel file with multiple worksheets...

Problem:

Specific Problem: How to append each data frame to the list...

Technophobe01

People also ask

2 Answers

imap solution

Konrad

emilBeBri

Related questions

Recent Activity

Donate For Us

`imap` solution