I have the following df:
    code . role    . persons
    123  . Janitor . 3
    123  . Analyst . 2
    321  . Vallet  . 2
    321  . Auditor . 5

The first line means that I have 3 persons with the role Janitor. My problem is that I need one line for each person. My df should look like this:
df:

    code . role    . persons
    123  . Janitor . 3
    123  . Janitor . 3
    123  . Janitor . 3
    123  . Analyst . 2
    123  . Analyst . 2
    321  . Vallet  . 2
    321  . Vallet  . 2
    321  . Auditor . 5
    321  . Auditor . 5
    321  . Auditor . 5
    321  . Auditor . 5
    321  . Auditor . 5

How could I do that using pandas?
Use DataFrame.drop_duplicates() to drop duplicate rows and keep the first occurrence of each. Called without any arguments, DataFrame.drop_duplicates() drops rows that have the same values in all columns.
The pandas DataFrame.duplicated() method is used to find duplicate rows in a DataFrame. It returns a boolean Series that marks each row as a duplicate (True) or not (False); by default the first occurrence of a row is not marked.
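Note that these two methods do the reverse of what the question asks: they collapse repeated rows instead of creating them. A minimal sketch of that relationship, assuming the question's data is rebuilt by hand (the names `expanded`, `dup_mask` and `collapsed` are only illustrative):

```python
import pandas as pd

# The question's data, rebuilt by hand for illustration.
df = pd.DataFrame({
    "code": [123, 123, 321, 321],
    "role": ["Janitor", "Analyst", "Vallet", "Auditor"],
    "persons": [3, 2, 2, 5],
})

# Expand to one row per person (the reindex + repeat approach).
expanded = df.reindex(df.index.repeat(df["persons"])).reset_index(drop=True)

# duplicated() flags every repeat after the first occurrence of a row.
dup_mask = expanded.duplicated()

# drop_duplicates() keeps only the first occurrence of each row.
collapsed = expanded.drop_duplicates().reset_index(drop=True)

print(dup_mask.sum())
print(collapsed)
```

So drop_duplicates() is useful for getting back to the compact form, but not for producing the one-row-per-person form.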
df.index.repeat(3) creates an index in which each label is repeated 3 times, and df.iloc[df.index.repeat(3), :] builds a dataframe whose rows follow that repeated index.
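A small sketch of the fixed-count case, assuming the question's data and a default 0..n-1 index. The default index matters here: iloc selects by position, so passing index labels to it only works when labels and positions coincide.

```python
import pandas as pd

# The question's data, rebuilt by hand for illustration.
df = pd.DataFrame({
    "code": [123, 123, 321, 321],
    "role": ["Janitor", "Analyst", "Vallet", "Auditor"],
    "persons": [3, 2, 2, 5],
})

# Each index label repeated 3 times: 0, 0, 0, 1, 1, 1, ...
repeated_labels = df.index.repeat(3)

# With the default RangeIndex, labels equal positions, so iloc can use them.
tripled = df.iloc[repeated_labels, :]

print(tripled)
```

This repeats every row the same number of times; to repeat each row by its own `persons` value, pass that column to repeat instead of a constant.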
You can also loop over a pandas dataframe row by row, for example with iterrows() or itertuples(), and read each column's value for the current row.
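For comparison, here is a sketch of the explicit-loop approach using itertuples(), assuming the question's data. It is usually slower than the vectorised repeat-based answers, but it shows what they do under the hood. The `rows` list and `expanded` name are just illustrative.

```python
import pandas as pd

# The question's data, rebuilt by hand for illustration.
df = pd.DataFrame({
    "code": [123, 123, 321, 321],
    "role": ["Janitor", "Analyst", "Vallet", "Auditor"],
    "persons": [3, 2, 2, 5],
})

rows = []
for row in df.itertuples(index=False):
    # int() guards against NumPy integers, which would turn
    # list * n into elementwise multiplication instead of repetition.
    rows.extend([row] * int(row.persons))

expanded = pd.DataFrame(rows, columns=df.columns)

print(expanded)
```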
reindex + repeat

    df.reindex(df.index.repeat(df.persons))
    Out[951]:
       code  .     role ..1  persons
    0   123  .  Janitor   .        3
    0   123  .  Janitor   .        3
    0   123  .  Janitor   .        3
    1   123  .  Analyst   .        2
    1   123  .  Analyst   .        2
    2   321  .   Vallet   .        2
    2   321  .   Vallet   .        2
    3   321  .  Auditor   .        5
    3   321  .  Auditor   .        5
    3   321  .  Auditor   .        5
    3   321  .  Auditor   .        5
    3   321  .  Auditor   .        5

PS: you can add .reset_index(drop=True) to get a new index.
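A self-contained version of this answer, as a sketch: the dataframe is rebuilt by hand from the question (without the "." separator columns that appear in the output above), and `expanded` is just an illustrative name.

```python
import pandas as pd

# The question's data, rebuilt by hand for illustration.
df = pd.DataFrame({
    "code": [123, 123, 321, 321],
    "role": ["Janitor", "Analyst", "Vallet", "Auditor"],
    "persons": [3, 2, 2, 5],
})

# Repeat each index label as many times as that row's `persons` value,
# then reindex so every repeated label pulls in a copy of its row.
expanded = df.reindex(df.index.repeat(df["persons"]))

# Replace the repeated labels (0, 0, 0, 1, 1, ...) with a fresh 0..n-1 index.
expanded = expanded.reset_index(drop=True)

print(expanded)
```

This relies on the original index being unique, since reindex cannot look up rows through duplicate labels in the source.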
Wen's solution is really nice and intuitive. Here's an alternative, calling repeat on df.values.
    df
       code     role  persons
    0   123  Janitor        3
    1   123  Analyst        2
    2   321   Vallet        2
    3   321  Auditor        5

    pd.DataFrame(df.values.repeat(df.persons, axis=0), columns=df.columns)
       code     role persons
    0   123  Janitor       3
    1   123  Janitor       3
    2   123  Janitor       3
    3   123  Analyst       2
    4   123  Analyst       2
    5   321   Vallet       2
    6   321   Vallet       2
    7   321  Auditor       5
    8   321  Auditor       5
    9   321  Auditor       5
    10  321  Auditor       5
    11  321  Auditor       5
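One caveat worth sketching: with mixed column types, df.values is a single array with a common dtype, so the rebuilt dataframe may not keep the original numeric dtypes. The astype step below is an addition for illustration, not part of the answer above.

```python
import pandas as pd

# The question's data, rebuilt by hand for illustration.
df = pd.DataFrame({
    "code": [123, 123, 321, 321],
    "role": ["Janitor", "Analyst", "Vallet", "Auditor"],
    "persons": [3, 2, 2, 5],
})

# Repeat whole rows of the underlying NumPy array along axis 0.
raw = pd.DataFrame(df.values.repeat(df["persons"], axis=0), columns=df.columns)

# Cast each column back to the dtype it had in the original dataframe.
expanded = raw.astype(df.dtypes.to_dict())

print(expanded.dtypes)
```

The result already has a fresh 0..n-1 index, so no reset_index call is needed with this approach.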