I have a DataFrame in which one column holds a list of values (each value is the value of one feature). I need to expand those lists so that each element gets its own column.
Example: the DataFrame has two columns, and the data column holds a list of values in each row:
data        Time
[1,2,3,4]   12:34
[5,6,7,8]   12:36
[9,1,2,3]   12:45
I need to convert it to:
data0  data1  data2  data3  Time
1      2      3      4      12:34
5      6      7      8      12:36
9      1      2      3      12:45
How can I do this efficiently?
numpy
We get a very fast solution by using np.column_stack directly on the values. The only thing left to do is to stitch together the column names:
v = np.column_stack([df.data.values.tolist(), df.Time.values])
c = ['data{}'.format(i) for i in range(v.shape[1] - 1)] + ['Time']
pd.DataFrame(v, df.index, c)
data0 data1 data2 data3 Time
0 1 2 3 4 12:34
1 5 6 7 8 12:36
2 9 1 2 3 12:45
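For anyone who wants to run the numpy approach end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question, and the final astype step is my own addition, not part of the answer above: stacking integers next to time strings typically produces a single mixed array, so the numeric columns may come back as object dtype and may need converting.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "data": [[1, 2, 3, 4], [5, 6, 7, 8], [9, 1, 2, 3]],
    "Time": ["12:34", "12:36", "12:45"],
})

# Stack the 2-D block of list values next to the Time column.
# to_numpy() is used here in place of .values; for this frame it serves the same purpose.
v = np.column_stack([df["data"].to_numpy().tolist(), df["Time"].to_numpy()])

# One name per data column, plus the trailing Time column.
c = ["data{}".format(i) for i in range(v.shape[1] - 1)] + ["Time"]
out = pd.DataFrame(v, df.index, c)

# Assumption: the numeric columns should be integers again.
data_cols = c[:-1]
out[data_cols] = out[data_cols].astype(int)
print(out)
```

If the lists have different lengths, the stacking step will not produce a rectangular block, so this sketch assumes every row holds the same number of values.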
timeit
%%timeit
pd.DataFrame(df['data'].values.tolist()).add_prefix('data').join(df['Time'])
1000 loops, best of 3: 1.13 ms per loop
%%timeit
v = np.column_stack([df.data.values.tolist(), df.Time.values])
c = ['data{}'.format(i) for i in range(v.shape[1] - 1)] + ['Time']
pd.DataFrame(v, df.index, c)
10000 loops, best of 3: 183 µs per loop
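The %%timeit cells above only run inside IPython or Jupyter. A rough way to repeat the comparison in a plain script is sketched below with the standard timeit module. The frame size of 10,000 rows and the helper names via_constructor and via_column_stack are hypothetical choices of mine, and the numbers you get will depend on your machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger input: 10,000 rows, each holding a 4-element list.
n_rows = 10_000
big = pd.DataFrame({
    "data": [[i, i + 1, i + 2, i + 3] for i in range(n_rows)],
    "Time": ["12:34"] * n_rows,
})


def via_constructor(df):
    # DataFrame constructor + add_prefix + join, as in the first timed cell.
    return pd.DataFrame(df["data"].to_numpy().tolist()).add_prefix("data").join(df["Time"])


def via_column_stack(df):
    # np.column_stack on the raw values, as in the second timed cell.
    v = np.column_stack([df["data"].to_numpy().tolist(), df["Time"].to_numpy()])
    c = ["data{}".format(i) for i in range(v.shape[1] - 1)] + ["Time"]
    return pd.DataFrame(v, df.index, c)


repeats = 20
for fn in (via_constructor, via_column_stack):
    total = timeit.timeit(lambda: fn(big), number=repeats)
    print("{}: {:.6f} s per call".format(fn.__name__, total / repeats))
```

It is worth timing on data that resembles your own, since a ranking measured on a three-row frame does not necessarily hold at larger sizes.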
You can use the DataFrame constructor, converting the data column to a numpy array with values and then to nested lists with tolist, then add_prefix, and finally join the Time column:
df = pd.DataFrame(df['data'].values.tolist()).add_prefix('data').join(df['Time'])
print (df)
data0 data1 data2 data3 Time
0 1 2 3 4 12:34
1 5 6 7 8 12:36
2 9 1 2 3 12:45
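One detail to watch with this approach: join aligns on the index, and a frame built from a plain list gets a fresh 0..n-1 index. The sketch below assumes a frame with a non-default index (the labels "a", "b", "c" are made up for illustration) and passes index=df.index to the constructor so the rows still line up with Time.

```python
import pandas as pd

# Sample data from the question, with a hypothetical non-default index.
df = pd.DataFrame(
    {
        "data": [[1, 2, 3, 4], [5, 6, 7, 8], [9, 1, 2, 3]],
        "Time": ["12:34", "12:36", "12:45"],
    },
    index=["a", "b", "c"],
)

# Reuse the original index so that join() matches each row to its Time value.
expanded = pd.DataFrame(df["data"].tolist(), index=df.index).add_prefix("data")
result = expanded.join(df["Time"])
print(result)
```

Without index=df.index, the expanded frame here would be indexed 0, 1, 2, and the join would find no matching labels, leaving the Time column empty.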