Pandas: Reading Excel with merged cells

Tags:

I have Excel files with multiple sheets, each of which looks a little like this (but much longer):

        Sample  CD4     CD8 Day 1   8311    17.3    6.44         8312    13.6    3.50         8321    19.8    5.88         8322    13.5    4.09 Day 2   8311    16.0    4.92         8312    5.67    2.28         8321    13.0    4.34         8322    10.6    1.95

The first column is actually four cells merged vertically.

When I read this using pandas.read_excel, I get a DataFrame that looks like this:

       Sample    CD4   CD8 Day 1    8311  17.30  6.44 NaN      8312  13.60  3.50 NaN      8321  19.80  5.88 NaN      8322  13.50  4.09 Day 2    8311  16.00  4.92 NaN      8312   5.67  2.28 NaN      8321  13.00  4.34 NaN      8322  10.60  1.95

How can I either get Pandas to understand merged cells, or quickly and easily remove the NaN and group by the appropriate value? (One approach would be to reset the index, step through to find the values and replace NaNs with values, pass in the list of days, then set the index to the column. But it seems like there should be a simpler approach.)

522

asked Apr 08 '14 12:04

iayork

2 Answers

You could use the Series.fillna method to forword-fill in the NaN values:

df.index = pd.Series(df.index).fillna(method='ffill')

For example,

In [42]: df Out[42]:         Sample    CD4   CD8 Day 1    8311  17.30  6.44 NaN      8312  13.60  3.50 NaN      8321  19.80  5.88 NaN      8322  13.50  4.09 Day 2    8311  16.00  4.92 NaN      8312   5.67  2.28 NaN      8321  13.00  4.34 NaN      8322  10.60  1.95  [8 rows x 3 columns]  In [43]: df.index = pd.Series(df.index).fillna(method='ffill')  In [44]: df Out[44]:         Sample    CD4   CD8 Day 1    8311  17.30  6.44 Day 1    8312  13.60  3.50 Day 1    8321  19.80  5.88 Day 1    8322  13.50  4.09 Day 2    8311  16.00  4.92 Day 2    8312   5.67  2.28 Day 2    8321  13.00  4.34 Day 2    8322  10.60  1.95  [8 rows x 3 columns]

126

answered Sep 21 '22 15:09

unutbu

df = df.fillna(method='ffill', axis=0)  # resolved updating the missing row entries

answered Sep 20 '22 15:09

Muth

Related questions
                            
                                Function not changing global variable
                            
                                Finding the indices of matching elements in list in Python
                            
                                Passing list-likes to .loc or [] with any missing labels is no longer supported
                            
                                how to test if one python module has been imported?
                            
                                Zipping lists of unequal size
                            
                                matplotlib imshow - default colour normalisation
                            
                                How Do I Use Raw Socket in Python?
                            
                                What's the reverse of shlex.split?
                            
                                How can I convert surrogate pairs to normal string in Python?
                            
                                Check if geo-point is inside or outside of polygon
                            
                                Parsing datetime in Python..?
                            
                                How does sklearn.svm.svc's function predict_proba() work internally?
                            
                                How to 'turn off' blurry effect of imshow() in matplotlib?
                            
                                How to add different graphs (as an inset) in another python graph [duplicate]
                            
                                How to bootstrap installation of Python modules on Amazon EMR?
                            
                                VS Code starts debugging in integrated terminal instead of debug console
                            
                                Python import src modules when running tests
                            
                                Reset a column's MultiIndex levels
                            
                                Python threading multiple bash subprocesses?
                            
                                What is the difference between SymPy and Sage?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Pandas: Reading Excel with merged cells

Tags:

python

pandas

excel

iayork

People also ask

2 Answers

unutbu

Muth

Recent Activity

Donate For Us