I have a Pandas DataFrame column with multiple lists within a list. Something like this:
df
   col1
0  [[1,2], [2,3]]
1  [[a,b], [4,5], [x,y]]
2  [[6,7]]
I want to split the list over multiple columns so the output should be something like:
   col1   col2   col3
0  [1,2]  [2,3]
1  [a,b]  [4,5]  [x,y]
2  [6,7]
Please help me with this. Thanks in advance
The Series.str.split() function breaks the string values of a single column into multiple parts around a specified separator or delimiter. It is similar to Python's built-in str.split() method, but str.split() works on one string at a time, whereas Series.str.split() is applied to every element of a Series.
pandas provides Series.str.split() to split strings around a passed separator or delimiter. The result can be stored as a list in each element of a Series, or it can be expanded into multiple DataFrame columns from a single delimited string.
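As a rough illustration of the two behaviours described above, here is a minimal sketch. The sample strings and the variable names are made up for this example. Note that this applies to string columns only, not to a column that already holds Python lists as in the question.

```python
import pandas as pd

# Hypothetical sample data: one Series of comma-separated strings.
s = pd.Series(["a,b,c", "d,e", "f"])

# Default behaviour: each element becomes a list of substrings.
as_lists = s.str.split(",")

# expand=True spreads the substrings over separate columns;
# shorter rows are padded with missing values.
as_columns = s.str.split(",", expand=True)

print(as_lists)
print(as_columns)
```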
Pandas DataFrame: explode() function. The explode() function transforms each element of a list-like into its own row, replicating the index values. It returns the exploded lists as rows of the subset columns; the index is duplicated for these rows. Raises: ValueError if the columns of the frame are not unique.
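For comparison, here is a small sketch of what explode() does with the data from the question, assuming a pandas version that provides DataFrame.explode (0.25 or later). It yields one row per inner list rather than one column per inner list, so it is not a direct answer to the question, but it shows the index duplication mentioned above.

```python
import pandas as pd

df = pd.DataFrame({'col1': [[[1, 2], [2, 3]],
                            [['a', 'b'], [4, 5], ['x', 'y']],
                            [[6, 7]]]})

# Only the outer list is exploded: each inner list gets its own row,
# and the original index label is repeated for every row it produces.
exploded = df.explode('col1')
print(exploded)
```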
You can use pd.Series.apply:
import pandas as pd

df = pd.DataFrame({'col1': [[[1, 2], [2, 3]],
                            [['a', 'b'], [4, 5], ['x', 'y']],
                            [[6, 7]]]})
res = df['col1'].apply(pd.Series)
print(res)
        0       1       2
0  [1, 2]  [2, 3]     NaN
1  [a, b]  [4, 5]  [x, y]
2  [6, 7]     NaN     NaN
I think you need the DataFrame constructor if performance is important:
df = pd.DataFrame(df['col1'].values.tolist())
print(df)
        0       1       2
0  [1, 2]  [2, 3]    None
1  [a, b]  [4, 5]  [x, y]
2  [6, 7]    None    None
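The question asks for columns named col1, col2, col3, whereas the constructor produces integer labels. One possible way to get those names is sketched below. The name wide and the col{n} naming pattern are assumptions made for this example, and passing index=df.index is an optional extra that keeps the original row labels.

```python
import pandas as pd

df = pd.DataFrame({'col1': [[[1, 2], [2, 3]],
                            [['a', 'b'], [4, 5], ['x', 'y']],
                            [[6, 7]]]})

# Build the wide frame from the nested lists, keeping the original index.
wide = pd.DataFrame(df['col1'].tolist(), index=df.index)

# Rename the integer labels 0, 1, 2 to col1, col2, col3.
wide.columns = [f'col{i + 1}' for i in range(wide.shape[1])]
print(wide)
```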
If you need to remove NaNs (missing values), first add dropna:
df = pd.DataFrame(df['col1'].dropna().values.tolist())
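To make the effect of dropna concrete, here is a minimal sketch with a made-up frame in which one row of col1 is missing. The names clean and res are chosen for this example. Passing index=clean.index is an addition to the line above, under the assumption that you want the surviving rows to keep their original labels. Without it the result gets a fresh 0..n-1 index.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the middle row has no list at all.
df = pd.DataFrame({'col1': [[[1, 2], [2, 3]],
                            np.nan,
                            [[6, 7]]]})

# Drop the missing rows first, then split the remaining lists into columns.
clean = df['col1'].dropna()
res = pd.DataFrame(clean.tolist(), index=clean.index)
print(res)
```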