I need to combine multiple rows into a single row. The original dataframe looks like this:
IndividualID DayID TripID JourSequence TripPurpose
200100000001 1 1 1 3
200100000001 1 2 2 31
200100000001 1 3 3 23
200100000001 1 4 4 5
200100000009 1 55 1 3
200100000009 1 56 2 12
200100000009 1 57 3 4
200100000009 1 58 4 6
200100000009 1 59 5 19
200100000009 1 60 6 2
I am trying to build a kind of 'trip chain': all the journey sequences and trip purposes of one individual on a single day should be in the same row.
Ideally, I would like to convert the table to something like this:
IndividualID DayID Seq1 TripPurp1 Seq2 TripPurp2 Seq3 TripPurp3 Seq4 TripPurp4
200100000001 1 1 3 2 31 3 23 4 5
200100000009 1 1 3 2 12 3 4 4 6
If this is not possible, the following format would also be fine:
IndividualID DayID TripPurposes
200100000001 1 3, 31, 23, 5
200100000009 1 3, 12, 4, 6
Are there any possible solutions? I was thinking of using a for loop or a while statement, but that is probably not a good idea. Thanks in advance!
To get your second output you just need to groupby and apply list:
df.groupby(['IndividualID', 'DayID'])['TripPurpose'].apply(list)
TripPurpose
IndividualID DayID
200100000001 1 [3, 31, 23, 5]
200100000009 1 [3, 12, 4, 6, 19, 2]
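If you want the comma-separated strings shown in the question rather than Python lists, one option is to join the values inside the aggregation. The following is a minimal sketch, not the only way to do it: the sample frame is rebuilt from the rows in the question, the result name `chains` is just an illustrative choice, and the sort on `JourSequence` assumes that column defines the order of the trips.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "IndividualID": [200100000001] * 4 + [200100000009] * 6,
    "DayID": [1] * 10,
    "TripID": [1, 2, 3, 4, 55, 56, 57, 58, 59, 60],
    "JourSequence": [1, 2, 3, 4, 1, 2, 3, 4, 5, 6],
    "TripPurpose": [3, 31, 23, 5, 3, 12, 4, 6, 19, 2],
})

# Sort so each chain follows the journey order (assumption: JourSequence
# defines that order), then join each group's purposes into one string.
chains = (
    df.sort_values(["IndividualID", "DayID", "JourSequence"])
      .groupby(["IndividualID", "DayID"])["TripPurpose"]
      .agg(lambda s: ", ".join(s.astype(str)))
      .reset_index(name="TripPurposes")
)
print(chains)
```

Keeping lists (as in the answer above) is usually more convenient for further processing; the string form is mainly useful for display or export.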
To get your first output, you can do something like this (probably not the best approach):
# Collect each group's trip purposes into a list, then expand the lists into columns.
df2 = pd.DataFrame(df.groupby(['IndividualID', 'DayID'])['TripPurpose'].apply(list))
trip = df2['TripPurpose'].apply(pd.Series).rename(columns=lambda x: 'TripPurpose' + str(x + 1))
# Do the same for the journey sequences.
df3 = pd.DataFrame(df.groupby(['IndividualID', 'DayID'])['JourSequence'].apply(list))
seq = df3['JourSequence'].apply(pd.Series).rename(columns=lambda x: 'seq' + str(x + 1))
# Join the two wide tables on the shared group index.
pd.merge(trip, seq, on=['IndividualID', 'DayID'])
Note that the columns of the output are not sorted.
You can try:
df_out = (df.set_index(['IndividualID', 'DayID', df.groupby(['IndividualID', 'DayID']).cumcount() + 1])
            .unstack()
            .sort_index(level=1, axis=1))
df_out.columns = df_out.columns.map('{0[0]}_{0[1]}'.format)
df_out.reset_index()
Output:
IndividualID DayID JourSequence_1 TripID_1 TripPurpose_1 \
0 200100000001 1 1.0 1.0 3.0
1 200100000009 1 1.0 55.0 3.0
JourSequence_2 TripID_2 TripPurpose_2 JourSequence_3 TripID_3 \
0 2.0 2.0 31.0 3.0 3.0
1 2.0 56.0 12.0 3.0 57.0
TripPurpose_3 JourSequence_4 TripID_4 TripPurpose_4 JourSequence_5 \
0 23.0 4.0 4.0 5.0 NaN
1 4.0 4.0 58.0 6.0 5.0
TripID_5 TripPurpose_5 JourSequence_6 TripID_6 TripPurpose_6
0 NaN NaN NaN NaN NaN
1 59.0 19.0 6.0 60.0 2.0
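The values in the output above are floats because `unstack` fills the missing trips with NaN, which forces a float dtype. If you would rather keep integers, one possible approach is to cast to the nullable `Int64` dtype. This is a sketch under two assumptions: pandas 0.24 or later (where `Int64` is available), and a sample frame rebuilt from the rows in the question; the names `keys`, `pos` and `wide` are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "IndividualID": [200100000001] * 4 + [200100000009] * 6,
    "DayID": [1] * 10,
    "TripID": [1, 2, 3, 4, 55, 56, 57, 58, 59, 60],
    "JourSequence": [1, 2, 3, 4, 1, 2, 3, 4, 5, 6],
    "TripPurpose": [3, 31, 23, 5, 3, 12, 4, 6, 19, 2],
})

keys = ["IndividualID", "DayID"]
# Position of each trip within its (individual, day) group: 1, 2, 3, ...
pos = df.groupby(keys).cumcount() + 1

# Same reshape as in the answer: one row per group, one column block per trip.
wide = df.set_index(keys + [pos]).unstack().sort_index(level=1, axis=1)
wide.columns = [f"{name}_{i}" for name, i in wide.columns]

# Cast to the nullable integer dtype so missing trips show as <NA>
# instead of turning every column into floats.
wide = wide.astype("Int64").reset_index()
print(wide)
```

Missing trips then appear as `<NA>`, and the remaining values print as `3`, `31`, and so on rather than `3.0`, `31.0`.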