I have a pandas series that contains an array for each element, like so:
0 [0, 0]
1 [12, 15]
2 [43, 45]
3 [9, 10]
4 [0, 0]
5 [3, 3]
6 [0, 0]
7 [0, 0]
8 [0, 0]
9 [3, 3]
10 [2, 2]
I want to extract all the first elements into another Series or list, and do the same for the second elements. I've tried using a regular expression:
mySeries.str.extract(r'\[(\d+), (\d+)\]', expand=True)
and also splitting:
mySeries.str.split(', ').tolist()
Both give NaN values. What am I doing wrong?
Case 1
Column of lists
You will need to call .tolist() on that column and load the result into a DataFrame:
pd.DataFrame(df['col'].tolist())
df
col
0 [0, 0]
1 [12, 15]
2 [43, 15]
3 [9, 10]
4 [0, 0]
5 [3, 3]
6 [0, 0]
7 [0, 0]
8 [0, 0]
9 [3, 3]
10 [2, 2]
pd.DataFrame(df['col'].tolist())
     0   1
0    0   0
1   12  15
2   43  15
3    9  10
4    0   0
5    3   3
6    0   0
7    0   0
8    0   0
9    3   3
10   2   2
Note: If your data has NaNs, I'd recommend dropping them first with df = df.dropna(), and then proceeding as shown above.
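To get each position as its own Series or list, as the question asks, the expanded DataFrame can be split by column. Here is a minimal sketch, assuming a column named col that holds Python lists plus one missing value; the sample frame and the names expanded, first, and second are illustrative, not from the original post:

```python
import pandas as pd

# Illustrative data: 'col' holds Python lists, with one missing row.
df = pd.DataFrame({'col': [[0, 0], [12, 15], None, [9, 10]]})

# Drop missing rows first so every remaining element is a list.
df = df.dropna()

# Expand the lists into columns; passing index keeps the original row labels.
expanded = pd.DataFrame(df['col'].tolist(), index=df.index)

first = expanded[0]            # Series of first elements
second = expanded[1].tolist()  # plain list of second elements
```

Passing index=df.index is a design choice here: after dropna() the row labels are no longer contiguous, and keeping them makes it possible to align the result back to the original frame.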
Case 2
Column of lists represented as strings
If you have < 100 rows, use:
df['col'] = pd.eval(df['col'])
And then apply Case 1. Otherwise, use ast:
import ast
df['col'] = df['col'].apply(ast.literal_eval)
And proceed as before.
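Put together, Case 2 might look like the following sketch. It assumes every value in col is a string holding a valid Python list literal; the sample data and the name expanded are illustrative:

```python
import ast
import pandas as pd

# Illustrative data: each element is a string that looks like a list.
df = pd.DataFrame({'col': ['[0, 0]', '[12, 15]', '[43, 15]']})

# Parse each string into a real Python list.
df['col'] = df['col'].apply(ast.literal_eval)

# From here on, this is Case 1: expand the lists into columns.
expanded = pd.DataFrame(df['col'].tolist(), index=df.index)
```

A quick way to tell which case applies is to inspect type(df['col'].iloc[0]): list means Case 1, str means Case 2.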