Why am I getting an empty row in my dataframe after using pandas apply?

I'm fairly new to Python and pandas and am trying to figure out how to do a simple split-apply-combine. The problem is that I get a blank row at the top of every DataFrame returned by pandas' apply function, and I'm not sure why. Can anyone explain?

The following is a minimal example that demonstrates the problem, not my actual code:

import pandas as pd

sorbet = pd.DataFrame({
    'flavour': ['orange', 'orange', 'lemon', 'lemon'],
    'niceosity': [4, 5, 7, 8]})

# Summarise one column of a group: number of rows and their mean.
def calc_vals(df, target):
    return pd.Series({'total': df[target].count(), 'mean': df[target].mean()})

sorbet_grouped = sorbet.groupby('flavour')
sorbet_vals = sorbet_grouped.apply(calc_vals, target='niceosity')

If I then do print(sorbet_vals) I get this output:

         mean  total
flavour                 <--- Why are there spaces here?
lemon     7.5      2
orange    4.5      2

[2 rows x 2 columns]

Compare this with print(sorbet):

  flavour  niceosity     <--- Note how column names line up
0  orange          4
1  orange          5
2   lemon          7
3   lemon          8

[4 rows x 2 columns]

What is causing this discrepancy and how can I fix it?

asked Mar 27 '14 16:03

Jack Aidley

1 Answer

The groupby/apply operation returns a new DataFrame with a named index. The name is the name of the column by which the original DataFrame was grouped.
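As a quick check of this, the sketch below rebuilds the question's example and inspects the index of the result. Selecting `[['niceosity']]` before calling `apply` is an addition that is not in the question; it is assumed here only so that the grouping column stays out of the frames passed to `calc_vals`.

```python
import pandas as pd

sorbet = pd.DataFrame({
    'flavour': ['orange', 'orange', 'lemon', 'lemon'],
    'niceosity': [4, 5, 7, 8]})

def calc_vals(df, target):
    return pd.Series({'total': df[target].count(), 'mean': df[target].mean()})

# Assumption: the [['niceosity']] selection is added for this sketch so the
# grouping column is not part of each group's frame.
sorbet_vals = sorbet.groupby('flavour')[['niceosity']].apply(
    calc_vals, target='niceosity')

# The index inherits its name from the grouping column.
print(sorbet_vals.index.name)
# The group keys themselves become the index values.
print(list(sorbet_vals.index))
```

The extra line in the printed DataFrame is where pandas displays that index name.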

The name shows up above the index. If you reset it to None, then that row disappears:

In [155]: sorbet_vals.index.name = None

In [156]: sorbet_vals
Out[156]: 
        mean  total
lemon    7.5      2
orange   4.5      2

[2 rows x 2 columns]

Note that the name is useful -- I don't really recommend removing it. The name allows you to refer to that index by name rather than merely by number.
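A few ways the name can be put to use are sketched below. The variable names are illustrative, and the summary is built with `agg` rather than the question's `calc_vals`, purely to keep the example short; the index is named `flavour` either way.

```python
import pandas as pd

sorbet = pd.DataFrame({
    'flavour': ['orange', 'orange', 'lemon', 'lemon'],
    'niceosity': [4, 5, 7, 8]})

# Assumption: agg is used here instead of apply(calc_vals) for brevity.
sorbet_vals = sorbet.groupby('flavour')['niceosity'].agg(['count', 'mean'])

# Fetch the index values by name instead of by position.
flavours = sorbet_vals.index.get_level_values('flavour')

# query() can refer to a named index as if it were a column.
lemon_only = sorbet_vals.query("flavour == 'lemon'")

# reset_index() uses the name as the header of the new column.
as_column = sorbet_vals.reset_index()

print(list(flavours))
print(lemon_only)
print(list(as_column.columns))
```

Without the name, `query` would have to fall back on the generic `index` identifier, and `reset_index` would label the new column `index`.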

If you wish the index to be a column, use reset_index:

In [209]: sorbet_vals.reset_index(inplace=True); sorbet_vals
Out[209]: 
  flavour  mean  total
0   lemon   7.5      2
1  orange   4.5      2

[2 rows x 3 columns]
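If you would rather not modify sorbet_vals in place, the following sketch shows some non-mutating alternatives. It assumes a reasonably recent pandas (named aggregation needs 0.25 or later), and the names `flat`, `unnamed` and `direct` are made up for illustration.

```python
import pandas as pd

sorbet = pd.DataFrame({
    'flavour': ['orange', 'orange', 'lemon', 'lemon'],
    'niceosity': [4, 5, 7, 8]})

# Assumption: agg is used here instead of apply(calc_vals) for brevity.
sorbet_vals = sorbet.groupby('flavour')['niceosity'].agg(['count', 'mean'])

# Without inplace=True, reset_index returns a new frame and leaves
# sorbet_vals as it was.
flat = sorbet_vals.reset_index()

# rename_axis(None) returns a copy whose index has no name.
unnamed = sorbet_vals.rename_axis(None)

# as_index=False keeps the grouping column as an ordinary column
# from the start, so no reset is needed afterwards.
direct = sorbet.groupby('flavour', as_index=False).agg(
    total=('niceosity', 'count'),
    mean=('niceosity', 'mean'))

print(flat)
print(unnamed)
print(direct)
```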

answered Sep 27 '22 20:09

unutbu

Tags:

python

python-3.x

pandas
