
How to read merged excel cells with R

Tags: r, excel

I received hundreds of Excel sheets containing merged cells. The sender insists on using Excel and merging cells - nothing I can do about that. How do I read these using R?

The number of merged cells, their location in the sheet, and the value they contain change from sheet to sheet, and there may be more than one set of merged cells in the same sheet. The sheets are not actually in tabular format, and they contain other empty cells. I have successfully looped through all the files, cleaned up the whole mess, reshaped the result and obtained a tidy dataset (1 sheet instead of 736 Excel workbooks). The problem is that my solution so far ignores the information in the merged cells.

For example, a simplified version of the problem area of the input sheet might look something like this, where the merged cells (B2, B3, C2, C3) contain the word "X":

    A   B   C   D
1   a   f   i   l
2   b   X       m
3   c           n
4   d   g   j   o
5   e   h   k   p

How can I read the Excel sheet into R so that the result looks like this, with the word "X" repeated in every cell of the merged region?

    A   B   C   D
1   a   f   i   l
2   b   X   X   m
3   c   X   X   n
4   d   g   j   o
5   e   h   k   p
Dave Stumped asked May 29 '16 12:05



2 Answers

library(openxlsx)

data <- read.xlsx(xlsxFile = "Your path", fillMergedCells = TRUE, colNames = FALSE)

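The key argument is fillMergedCells = TRUE: the value of a merged cell is given to every cell within the merged region. colNames = FALSE keeps the first row as data rather than treating it as a header.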
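Since the question involves hundreds of workbooks, the same call can be wrapped in a loop. The following is only a sketch under assumptions: the function name `read_filled_workbooks` and its `dir` argument are made up for illustration, only the first sheet of each file is read, and openxlsx reads .xlsx files only (not legacy .xls).

```r
# Sketch: read sheet 1 of every .xlsx file in a folder, copying the value of
# each merged region into all of its cells.
# `read_filled_workbooks` and `dir` are hypothetical names, not part of openxlsx.
read_filled_workbooks <- function(dir) {
  files <- list.files(dir, pattern = "\\.xlsx$", full.names = TRUE,
                      ignore.case = TRUE)
  sheets <- lapply(files, function(f) {
    openxlsx::read.xlsx(
      xlsxFile = f,
      sheet = 1,               # assumption: the data lives on the first sheet
      colNames = FALSE,        # the sheets are not tabular, so keep row 1 as data
      fillMergedCells = TRUE   # give the merged value to every cell in the merge
    )
  })
  names(sheets) <- basename(files)  # label each data frame with its file name
  sheets
}

# Usage (the path is a placeholder):
# sheets <- read_filled_workbooks("path/to/workbooks")
```

The result is a named list with one data frame per workbook, which can then go through whatever cleaning and reshaping steps are already in place.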

MaazKhan47 answered Nov 03 '22 00:11


If a VBA/R hybrid suits your purposes, here is a VBA macro which will unmerge all cells in a worksheet, while simultaneously filling all cells in the unmerged region with the corresponding value:

Sub UnMerge(ws As Worksheet)
    Dim R As Range, c As Range
    Dim v As Variant
    For Each c In ws.UsedRange
        If c.MergeCells Then
            v = c.Value
            Set R = c.MergeArea
            R.UnMerge
            R.Value = v
        End If
    Next c
End Sub

A simple test to show how it is called:

Sub test()
    UnMerge Sheets(1)
End Sub

The sub UnMerge can be used as part of a larger program that e.g. iterates over all .xlsx files in a folder and all data-containing sheets in the files, unmerging them all and saving them as .csv files.

On Edit. Native VBA file handling is somewhat annoying. I tend to use the related scripting language VBScript if I need to iterate over multiple files. I'm not sure if your virtual Windows can handle VBScript. I would assume so since VBScript is a standard part of the Windows OS. If this is the case, see if the following works (after backing up the files just to be safe).

Save the code as a simple text file with a .vbs extension in the folder that contains the Excel files that you want to modify. Then, simply double-click its icon. It will iterate over all .xls and .xlsx files in the directory that contains the script and unmerge sheet 1 in each such file, saving each file in place. I didn't test it extensively and it contains no error-handling, but I did test it on a folder with three Excel files which each contained multiple merged regions and it ran as expected on my Windows machine. I don't know if it will work on your Mac:

Option Explicit

Dim fso,fol,f,xl, wb, ws,ext,v,r,c

Set fso = WScript.CreateObject("Scripting.FileSystemObject")
Set xl = CreateObject("Excel.Application")
xl.DisplayAlerts = False
xl.ScreenUpdating = False
set fol = fso.GetFolder(fso.GetParentFolderName(WScript.ScriptFullName))

For Each f In fol.Files
    ext = LCase(fso.GetExtensionName(f.Name))
    If ext = "xls" Or ext = "xlsx" Then
        Set wb = xl.Workbooks.Open(f.Path)
        Set ws = wb.Sheets(1)
        For Each c In ws.UsedRange
            If c.MergeCells Then
                v = c.Value
                Set R = c.MergeArea
                R.UnMerge
                R.Value = v
            End If
        Next
        wb.Save
        wb.Close   
    End If
Next

' Shut down the hidden Excel instance so it does not keep running in the background
xl.Quit
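Once the script has run, the workbooks no longer contain merged regions, so the R side of the hybrid needs no special handling. A minimal sketch, assuming the readxl package is installed (it reads both .xls and .xlsx) and that the data is on sheet 1; the function name `read_unmerged_workbooks` and its `dir` argument are hypothetical:

```r
# Sketch: read sheet 1 of every .xls/.xlsx file in a folder after the cells
# have been unmerged and filled by the script.
# `read_unmerged_workbooks` and `dir` are hypothetical names, not part of readxl.
read_unmerged_workbooks <- function(dir) {
  files <- list.files(dir, pattern = "\\.xlsx?$", full.names = TRUE,
                      ignore.case = TRUE)
  sheets <- lapply(files, function(f) {
    # col_names = FALSE keeps the first row as data; readxl then assigns
    # placeholder column names
    as.data.frame(readxl::read_excel(f, sheet = 1, col_names = FALSE))
  })
  names(sheets) <- basename(files)  # label each data frame with its file name
  sheets
}

# Usage (the path is a placeholder):
# sheets <- read_unmerged_workbooks("path/to/workbooks")
```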
John Coleman answered Nov 03 '22 01:11