I have the following data frame my_df:
Person  event  time
---------------------------------
John    A      2017-10-11
John    B      2017-10-12
John    C      2017-10-14
John    D      2017-10-15
Ann     X      2017-09-01
Ann     Y      2017-09-02
Dave    M      2017-10-05
Dave    N      2017-10-07
Dave    Q      2017-10-20
I want to create a new column, which is the (event, time) pair. It should look like:
Person  event  time        event_time
------------------------------------------------------
John    A      2017-10-11  (A, 2017-10-11)
John    B      2017-10-12  (B, 2017-10-12)
John    C      2017-10-14  (C, 2017-10-14)
John    D      2017-10-15  (D, 2017-10-15)
Ann     X      2017-09-01  (X, 2017-09-01)
Ann     Y      2017-09-02  (Y, 2017-09-02)
Dave    M      2017-10-05  (M, 2017-10-05)
Dave    N      2017-10-07  (N, 2017-10-07)
Dave    Q      2017-10-20  (Q, 2017-10-20)
Here is my code:
my_df['event_time'] = my_df.apply(lambda row: (row['event'], row['time']), axis=1)
But I got the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in create_block_manager_from_arrays(arrays, names, axes)
4309 blocks = form_blocks(arrays, names, axes)
-> 4310 mgr = BlockManager(blocks, axes)
4311 mgr._consolidate_inplace()
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in __init__(self, blocks, axes, do_integrity_check, fastpath)
2794 if do_integrity_check:
-> 2795 self._verify_integrity()
2796
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in _verify_integrity(self)
3005 if block._verify_integrity and block.shape[1:] != mgr_shape[1:]:
-> 3006 construction_error(tot_items, block.shape[1:], self.axes)
3007 if len(self.items) != tot_items:
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in construction_error(tot_items, block_shape, axes, e)
4279 raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 4280 passed, implied))
4281
ValueError: Shape of passed values is (128, 2), indices imply (128, 3)
Any idea what I did wrong in my code? Thanks!
You can use:
my_df['event_time'] = my_df[['event','time']].apply(tuple, axis=1)
Or:
my_df['event_time'] = tuple(zip(my_df['event'], my_df['time']))
Or:
my_df['event_time'] = [tuple(x) for x in my_df[['event','time']].values.tolist()]
All return:
print(my_df)
  Person event        time       event_time
0   John     A  2017-10-11  (A, 2017-10-11)
1   John     B  2017-10-12  (B, 2017-10-12)
2   John     C  2017-10-14  (C, 2017-10-14)
3   John     D  2017-10-15  (D, 2017-10-15)
4    Ann     X  2017-09-01  (X, 2017-09-01)
5    Ann     Y  2017-09-02  (Y, 2017-09-02)
6   Dave     M  2017-10-05  (M, 2017-10-05)
7   Dave     N  2017-10-07  (N, 2017-10-07)
8   Dave     Q  2017-10-20  (Q, 2017-10-20)
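To try the approaches above end to end, here is a minimal self-contained sketch. It assumes a shortened version of the sample data with the time column stored as plain strings (the question does not say which dtype it has), and it uses list(zip(...)) in place of tuple(zip(...)), which is a small substitution of mine rather than the exact line from the answer. The names via_apply, via_zip and via_listcomp are only illustrative.

```python
import pandas as pd

# Shortened sample data; 'time' is assumed to be stored as strings.
my_df = pd.DataFrame({
    "Person": ["John", "John", "Ann"],
    "event": ["A", "B", "X"],
    "time": ["2017-10-11", "2017-10-12", "2017-09-01"],
})

# Option 1: apply tuple() to each row of the two selected columns.
via_apply = my_df[["event", "time"]].apply(tuple, axis=1)

# Option 2: zip the two columns together (a list of tuples here).
via_zip = list(zip(my_df["event"], my_df["time"]))

# Option 3: convert the rows to lists, then each list to a tuple.
via_listcomp = [tuple(x) for x in my_df[["event", "time"]].values.tolist()]

# All three should produce the same (event, time) pairs.
assert via_apply.tolist() == via_zip == via_listcomp

my_df["event_time"] = via_zip
print(my_df)
```

On larger frames the zip version is usually the quickest of the three, because it avoids calling a Python function once per row through apply, but that is worth measuring on your own data.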