Repeat rows in a pandas DataFrame based on column value

I have the following df:

code . role    . persons
123  . Janitor . 3
123  . Analyst . 2
321  . Vallet  . 2
321  . Auditor . 5

The first row means that there are 3 persons with the role Janitor. I need one row per person, so my df should look like this:

df:

code . role    . persons
123  . Janitor . 3
123  . Janitor . 3
123  . Janitor . 3
123  . Analyst . 2
123  . Analyst . 2
321  . Vallet  . 2
321  . Vallet  . 2
321  . Auditor . 5
321  . Auditor . 5
321  . Auditor . 5
321  . Auditor . 5
321  . Auditor . 5

How could I do that using pandas?

asked Nov 16 '17 18:11

aabujamra

2 Answers

reindex + repeat

df.reindex(df.index.repeat(df.persons))
Out[951]:
   code  .     role ..1  persons
0   123  .  Janitor   .        3
0   123  .  Janitor   .        3
0   123  .  Janitor   .        3
1   123  .  Analyst   .        2
1   123  .  Analyst   .        2
2   321  .   Vallet   .        2
2   321  .   Vallet   .        2
3   321  .  Auditor   .        5
3   321  .  Auditor   .        5
3   321  .  Auditor   .        5
3   321  .  Auditor   .        5
3   321  .  Auditor   .        5

PS: you can append .reset_index(drop=True) to replace the repeated index labels with a fresh 0..n-1 index.
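For anyone who wants to run this end to end, here is a minimal sketch. It assumes the frame has plain `code`, `role` and `persons` columns (without the `.` separator columns visible in the output above) and a unique index; `reindex` raises an error if the source index itself contains duplicate labels. The variable name `repeated` is only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; column names are assumed.
df = pd.DataFrame({
    "code": [123, 123, 321, 321],
    "role": ["Janitor", "Analyst", "Vallet", "Auditor"],
    "persons": [3, 2, 2, 5],
})

# Repeat each index label `persons` times, then pull the matching rows.
repeated = df.reindex(df.index.repeat(df["persons"]))

# Optional: swap the duplicated labels (0, 0, 0, 1, 1, ...) for 0..n-1.
repeated = repeated.reset_index(drop=True)

print(repeated)
```

Because the rows are selected by label rather than rebuilt from raw values, each column keeps its original dtype.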

answered Sep 23 '22 19:09

BENY

Wen's solution is really nice and intuitive. Here's an alternative, calling repeat on df.values.

df
   code     role  persons
0   123  Janitor        3
1   123  Analyst        2
2   321   Vallet        2
3   321  Auditor        5

pd.DataFrame(df.values.repeat(df.persons, axis=0), columns=df.columns)
    code     role persons
0    123  Janitor       3
1    123  Janitor       3
2    123  Janitor       3
3    123  Analyst       2
4    123  Analyst       2
5    321   Vallet       2
6    321   Vallet       2
7    321  Auditor       5
8    321  Auditor       5
9    321  Auditor       5
10   321  Auditor       5
11   321  Auditor       5
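One thing to watch with this approach: on a frame that mixes integer and string columns, `df.values` is a single object-dtype array, so the rebuilt frame does not keep the integer dtypes. A minimal sketch of one way to restore them, assuming the same sample data as in the question (the names `raw` and `out` are only for illustration):

```python
import pandas as pd

# Sample data rebuilt from the question; column names are assumed.
df = pd.DataFrame({
    "code": [123, 123, 321, 321],
    "role": ["Janitor", "Analyst", "Vallet", "Auditor"],
    "persons": [3, 2, 2, 5],
})

# Repeat rows on the underlying NumPy array; the result already has a
# fresh 0..n-1 index, but its numeric columns come back as object dtype.
raw = pd.DataFrame(df.values.repeat(df["persons"], axis=0), columns=df.columns)

# Cast every column back to the dtype it had in the original frame.
out = raw.astype(df.dtypes.to_dict())

print(out.dtypes)
```

If the dtypes do not matter downstream, the `astype` step can be skipped.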

answered Sep 22 '22 19:09

cs95

Tags: python, pandas, dataframe, repeat