
Explode column of lists into multiple columns

I have a pandas series that contains an array for each element, like so:

0            [0, 0]
1          [12, 15]
2          [43, 45]
3           [9, 10]
4            [0, 0]
5            [3, 3]
6            [0, 0]
7            [0, 0]
8            [0, 0]
9            [3, 3]
10           [2, 2]

I want to extract all the first elements into another Series or list, and do the same for the second elements. I've tried a regular expression:

mySeries.str.extract(r'\[(\d+), (\d+)\]', expand=True)

and also splitting:

mySeries.str.split(', ').tolist()

Both give NaN values. What am I doing wrong?

Courtney White asked May 10 '18 02:05



1 Answer

Case 1
Column of lists
Call .tolist() on that column and load the result into a DataFrame.

pd.DataFrame(df['col'].tolist())

df
         col
0     [0, 0]
1   [12, 15]
2   [43, 15]
3    [9, 10]
4     [0, 0]
5     [3, 3]
6     [0, 0]
7     [0, 0]
8     [0, 0]
9     [3, 3]
10    [2, 2]

pd.DataFrame(df['col'].tolist())

     0   1
0    0   0
1   12  15
2   43  15
3    9  10
4    0   0
5    3   3
6    0   0
7    0   0
8    0   0
9    3   3
10   2   2

Note: If your data has NaNs, I'd recommend dropping them first: df = df.dropna() and then proceed as shown above.
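To get the two Series the question asks for, Case 1 can be sketched as below. The names `s`, `parts`, `first`, and `second` and the column labels are illustrative assumptions, not part of the original answer. The sketch also checks the likely reason the `.str.split` attempt returned NaN: the elements are lists, not strings, so the string method has nothing to split.

```python
import pandas as pd

# Hypothetical sample shaped like the series in the question
s = pd.Series([[0, 0], [12, 15], [43, 45], [9, 10]])

# String methods return NaN for elements that are not strings
all_nan = s.str.split(', ').isna().all()

# One column per list position; passing the index keeps rows aligned
# even if the original index is not 0..n-1 (e.g. after dropna)
parts = pd.DataFrame(s.tolist(), index=s.index, columns=["first", "second"])

first = parts["first"]        # Series of first elements
second = parts["second"]      # Series of second elements
first_list = first.tolist()   # plain Python list, if preferred
```

Passing `index=s.index` matters mainly when rows were dropped beforehand, because `pd.DataFrame(list_of_lists)` otherwise assigns a fresh 0-based index.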


Case 2
Column of strings represented as lists

If you have fewer than 100 rows, use:

df['col'] = pd.eval(df['col'])

Then apply Case 1. Otherwise, use ast:

import ast
df['col'] = df['col'].apply(ast.literal_eval)

And proceed as before.
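A minimal end-to-end sketch of Case 2 with `ast.literal_eval`, assuming every cell is a well-formed string such as "[12, 15]". The sample frame and the name `expanded` are hypothetical. Checking the type of one element first is a quick way to tell which case applies.

```python
import ast
import pandas as pd

# Hypothetical frame whose cells are strings that look like lists
df = pd.DataFrame({"col": ["[0, 0]", "[12, 15]", "[3, 3]"]})

# Case 2 applies when the elements are str rather than list
is_string_column = isinstance(df["col"].iloc[0], str)

# Parse each string into a real Python list
df["col"] = df["col"].apply(ast.literal_eval)

# Then proceed as in Case 1
expanded = pd.DataFrame(df["col"].tolist(), index=df.index)
```

`ast.literal_eval` only evaluates Python literals, so it is a safer choice than `eval` for strings read from a file.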

cs95 answered Sep 22 '22 17:09
