Apply function to pandas DataFrame that can return multiple rows

Tags: python, pandas, dataframe

I am trying to transform a DataFrame so that each row is replicated a given number of times (possibly zero). For example:

import pandas as pd
df = pd.DataFrame({'class': ['A', 'B', 'C'], 'count': [1, 0, 2]})

  class  count
0     A      1
1     B      0
2     C      2

should be transformed to:

  class
0     A
1     C
2     C

This is the inverse of aggregating with a count function. Is there an easy way to do this in pandas (without using for loops or list comprehensions)?

One possibility might be to allow the DataFrame.applymap function to return multiple rows (akin to the apply method of GroupBy). However, I do not think this is currently possible in pandas.

asked Oct 24 '12 13:10

btel

1 Answer

You could use groupby:

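def f(group):
    # One group per class; iloc[0] replaces irow(0), which pandas has removed.
    n = int(group['count'].iloc[0])
    # group.name is the group key, i.e. the class label.
    return pd.DataFrame({'class': [group.name] * n})
df.groupby('class', group_keys=False).apply(f)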

This gives:

In [25]: df.groupby('class', group_keys=False).apply(f)
Out[25]: 
  class
0     A
0     C
1     C

You can fix the index of the result however you like, for example with reset_index(drop=True).
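As an alternative that is not part of the original answer, the same expansion can be sketched without groupby by repeating index labels. The sketch below assumes the df from the question and that 'count' holds non-negative integers; the name expanded is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'class': ['A', 'B', 'C'], 'count': [1, 0, 2]})

# Repeat each index label 'count' times, then select those rows by label.
# A row with a count of 0 disappears because its label is repeated zero times.
expanded = df.loc[df.index.repeat(df['count']), ['class']]

# Replace the repeated labels with a fresh 0..n-1 index.
expanded = expanded.reset_index(drop=True)
print(expanded)
```

This should reproduce the three-row frame asked for in the question. It relies on Index.repeat and label-based selection with loc, so it avoids calling a Python function once per group.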

answered Oct 03 '22 05:10

Wes McKinney
