I have a dataframe (df) that looks like this:
+---------+-------+------------+----------+
| subject | pills |    date    | strength |
+---------+-------+------------+----------+
|       1 |     4 | 10/10/2012 |      250 |
|       1 |     4 | 10/11/2012 |      250 |
|       1 |     2 | 10/12/2012 |      500 |
|       2 |     1 | 1/6/2014   |     1000 |
|       2 |     1 | 1/7/2014   |      250 |
|       2 |     1 | 1/7/2014   |      500 |
|       2 |     3 | 1/8/2014   |      250 |
+---------+-------+------------+----------+

When I use reshape in R, I get what I want:
    reshape(df, idvar = c("subject", "date"), timevar = "strength", direction = "wide")

+---------+------------+--------------+--------------+---------------+
| subject |    date    | strength.250 | strength.500 | strength.1000 |
+---------+------------+--------------+--------------+---------------+
|       1 | 10/10/2012 | 4            | NA           | NA            |
|       1 | 10/11/2012 | 4            | NA           | NA            |
|       1 | 10/12/2012 | NA           | 2            | NA            |
|       2 | 1/6/2014   | NA           | NA           | 1             |
|       2 | 1/7/2014   | 1            | 1            | NA            |
|       2 | 1/8/2014   | 3            | NA           | NA            |
+---------+------------+--------------+--------------+---------------+

Using pandas:
    df.pivot_table(index=['subject', 'date'], columns='strength')

+---------+------------+-------+-----+------+
|         |            | pills              |
+---------+------------+-------+-----+------+
|         | strength   | 250   | 500 | 1000 |
+---------+------------+-------+-----+------+
| subject | date       |       |     |      |
+---------+------------+-------+-----+------+
| 1       | 10/10/2012 | 4     | NA  | NA   |
|         | 10/11/2012 | 4     | NA  | NA   |
|         | 10/12/2012 | NA    | 2   | NA   |
+---------+------------+-------+-----+------+
| 2       | 1/6/2014   | NA    | NA  | 1    |
|         | 1/7/2014   | 1     | 1   | NA   |
|         | 1/8/2014   | 3     | NA  | NA   |
+---------+------------+-------+-----+------+

How do I get exactly the same output as in R with pandas? I only want one header row.
For context: DataFrame.pivot_table() creates a spreadsheet-style pivot table as a DataFrame. The levels of the pivot table are stored in MultiIndex objects (hierarchical indexes) on the index and columns of the result, which is why the output above has more than one header row.
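To make the hierarchical result concrete, here is a minimal sketch that rebuilds the question's data and inspects the pivoted frame. The variable names (`df`, `pivoted`) are illustrative, and the data is retyped by hand from the table in the question.

```python
import pandas as pd

# Data retyped from the question's table (illustrative reconstruction).
df = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2, 2],
    "pills": [4, 4, 2, 1, 1, 1, 3],
    "date": ["10/10/2012", "10/11/2012", "10/12/2012",
             "1/6/2014", "1/7/2014", "1/7/2014", "1/8/2014"],
    "strength": [250, 250, 500, 1000, 250, 500, 250],
})

# No `values` argument: every remaining column (here only 'pills') is
# aggregated, and its name becomes the top level of the column index.
pivoted = df.pivot_table(index=["subject", "date"], columns="strength")

print(pivoted.columns.nlevels)   # number of header levels on the columns
print(pivoted.columns.tolist())  # tuples such as ('pills', 250)
print(pivoted.index.names)       # the row index is hierarchical as well
```

The two column levels and the two-level row index are what show up as the extra header rows in the printed output.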
After pivoting (with the result stored in a variable, here called pivoted), convert the DataFrame to records and then back to a DataFrame:
    flattened = pd.DataFrame(pivoted.to_records())
    #   subject        date  ('pills', 250)  ('pills', 500)  ('pills', 1000)
    #0        1  10/10/2012             4.0             NaN              NaN
    #1        1  10/11/2012             4.0             NaN              NaN
    #2        1  10/12/2012             NaN             2.0              NaN
    #3        2    1/6/2014             NaN             NaN              1.0
    #4        2    1/7/2014             1.0             1.0              NaN
    #5        2    1/8/2014             3.0             NaN              NaN

You can now "repair" the column names, if you want:
    flattened.columns = [hdr.replace("('pills', ", "strength.").replace(")", "")
                         for hdr in flattened.columns]
    flattened
    #   subject        date  strength.250  strength.500  strength.1000
    #0        1  10/10/2012           4.0           NaN            NaN
    #1        1  10/11/2012           4.0           NaN            NaN
    #2        1  10/12/2012           NaN           2.0            NaN
    #3        2    1/6/2014           NaN           NaN            1.0
    #4        2    1/7/2014           1.0           1.0            NaN
    #5        2    1/8/2014           3.0           NaN            NaN

It's awkward, but it works.
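A possibly less awkward variant, offered as a sketch rather than as part of the answer above: passing a single column name as `values` should leave only one column level, so the headers can be renamed directly and the row index moved back into ordinary columns with `reset_index()`. The data is retyped from the question, and the names `df`, `pivoted`, and `wide` are illustrative.

```python
import pandas as pd

# Data retyped from the question's table (illustrative reconstruction).
df = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2, 2],
    "pills": [4, 4, 2, 1, 1, 1, 3],
    "date": ["10/10/2012", "10/11/2012", "10/12/2012",
             "1/6/2014", "1/7/2014", "1/7/2014", "1/8/2014"],
    "strength": [250, 250, 500, 1000, 250, 500, 250],
})

# A scalar `values` keeps the columns single-level: 250, 500, 1000.
pivoted = df.pivot_table(values="pills", index=["subject", "date"],
                         columns="strength")

# Rename the headers in the style of R's reshape(), then turn the
# (subject, date) row index back into regular columns.
wide = pivoted.copy()
wide.columns = [f"strength.{c}" for c in wide.columns]
wide = wide.reset_index()

print(wide)
```

Note that pivot_table aggregates with the mean by default, so this only reproduces the R output when each (subject, date, strength) combination occurs once, as it does in the sample data.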