Read merged cells in Excel with Python

Tags: python, excel, cell, xlrd

I am trying to read merged cells of Excel with Python using xlrd.

My Excel: (note that the first column is merged across the three rows)

    A   B   C
  +---+---+----+
1 | 2 | 0 | 30 |
  +   +---+----+
2 |   | 1 | 20 |
  +   +---+----+
3 |   | 5 | 52 |
  +---+---+----+

I would like the third row of the first column to read as 2 in this example, but it returns ''. Is there a way to get the value of a merged cell?

My code:

import xlrd

all_data = [[]]
excel = xlrd.open_workbook(excel_dir + excel_file)
sheet_0 = excel.sheet_by_index(0)  # Open the first tab

for row_index in range(sheet_0.nrows):
    row = ""
    for col_index in range(sheet_0.ncols):
        value = sheet_0.cell(rowx=row_index, colx=col_index).value
        row += "{0} ".format(value)
        split_row = row.split()
    all_data.append(split_row)

What I get:

'2', '0', '30'
'1', '20'
'5', '52'

What I would like to get:

'2', '0', '30'
'2', '1', '20'
'2', '5', '52'

asked Jun 09 '15 08:06

Antoine

2 Answers

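You can also try forward-filling with pandas. The original suggestion used the fillna method: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.fillna.html. Note that fillna(method='ffill') is deprecated since pandas 2.1; ffill() does the same thing.

import pandas as pd

df = pd.read_excel(dir+filename,header=1)
df[ColName] = df[ColName].ffill()  # same effect as fillna(method='ffill')

This replaces each empty cell in that column with the last non-empty value above it.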
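To try the forward-fill idea without a workbook at hand, here is a minimal sketch on an in-memory DataFrame. It assumes that pandas reads only the top-left cell of a merged range as a value and the remaining cells as NaN; the column names A, B and C are made up for illustration and are not taken from a real file.

```python
import pandas as pd

# Stand-in for the sample sheet: column A is the merged column, so only its
# first row holds a value and the rest are missing (None becomes NaN).
df = pd.DataFrame({"A": [2, None, None], "B": [0, 1, 5], "C": [30, 20, 52]})

# Forward-fill only the merged column; other columns are left untouched.
df["A"] = df["A"].ffill()

# Convert back to plain Python lists of ints for easy inspection.
rows = df.astype(int).values.tolist()
print(rows)
```

Like the row-by-row approach, this fills every gap in the chosen column, whether or not the cell was actually part of a merged range, so it is best limited to columns known to contain merges.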

answered Oct 06 '22 23:10

pprasad009

I just tried this and it seems to work for your sample data:

import xlrd

all_data = []
excel = xlrd.open_workbook(excel_dir + excel_file)
sheet_0 = excel.sheet_by_index(0)  # Open the first tab

prev_row = [None] * sheet_0.ncols
for row_index in range(sheet_0.nrows):
    row = []
    for col_index in range(sheet_0.ncols):
        value = sheet_0.cell(rowx=row_index, colx=col_index).value
        # xlrd returns '' for an empty cell. Comparing with '' instead of
        # calling len() also works when the cell holds a number (a float).
        if value == '':
            value = prev_row[col_index]
        row.append(value)
    prev_row = row
    all_data.append(row)

returning

[['2', '0', '30'], ['2', '1', '20'], ['2', '5', '52']]

It keeps track of the values from the previous row and uses them if the corresponding value from the current row is empty.

Note that the above code does not check if a given cell is actually part of a merged set of cells, so it could possibly duplicate previous values in cases where the cell should really be empty. Still, it might be of some help.

Additional information:

I subsequently found a documentation page that describes a merged_cells attribute, which lists the ranges of merged cells in a sheet. The documentation says that it is "New in version 0.6.1", but when I tried to use it with xlrd-0.9.3 as installed by pip, I got the error

NotImplementedError: formatting_info=True not yet implemented

I'm not particularly inclined to start chasing down different versions of xlrd to test the merged_cells feature, but perhaps you might be interested in doing so if the above code is insufficient for your needs and you encounter the same error that I did with formatting_info=True.
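For anyone who does get merged_cells working, here is a rough sketch of how it could be used. It relies on the layout xlrd documents for sheet.merged_cells, a list of (rlo, rhi, clo, chi) tuples where rhi and chi are exclusive. The helper name fill_merged and the extra fourth row are made up for illustration; the function works on plain lists so it can be tried without a workbook.

```python
def fill_merged(rows, merged_cells):
    """Copy each merged range's top-left value into every cell of that range.

    rows: list of row lists, e.g. built from sheet.row_values(i).
    merged_cells: iterable of (rlo, rhi, clo, chi) tuples, with rhi and chi
    exclusive, as in xlrd's sheet.merged_cells.
    """
    filled = [list(r) for r in rows]  # copy so the input is left untouched
    for rlo, rhi, clo, chi in merged_cells:
        top_left = rows[rlo][clo]  # only this cell carries the value
        for r in range(rlo, rhi):
            for c in range(clo, chi):
                filled[r][c] = top_left
    return filled


# Sample data from the question, plus a hypothetical fourth row whose first
# cell is genuinely empty (not part of any merged range).
rows = [["2", "0", "30"], ["", "1", "20"], ["", "5", "52"], ["", "7", "9"]]
merged = [(0, 3, 0, 1)]  # column A merged across the first three rows

result = fill_merged(rows, merged)
print(result)
```

Unlike the previous-row approach, the genuinely empty cell in the last row stays empty. With a real file, the inputs would come from something like xlrd.open_workbook(path, formatting_info=True), then sheet.merged_cells and sheet.row_values(i). As far as I can tell, the NotImplementedError above is what xlrd raises for .xlsx input, so this probably only works with files in the older .xls format.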

answered Oct 07 '22 01:10

Gord Thompson
