I have a sample dataframe like this, with two columns, ID and Main:
ID,Main
0,[30 115 266 38;662 99 1199 43] [511 133 25 47] [664 162 49 22]
How do I reshape my dataframe into something like the output below using pandas?
Expected output:
ID,Main
0,30 115 266 38
0,662 99 1199 43
0,511 133 25 47
0,664 162 49 22
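First replace ; with ][ so that every group sits in its own pair of brackets, then extract the values between [ and ] with str.findall to get a Series of lists.
Finally, create a DataFrame from those lists, reshape it with stack, and clean up the index with reset_index: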
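import pandas as pd

# Put each group in its own brackets, then collect everything between [ and ]
s = df['Main'].fillna('').str.replace(';', '][', regex=False).str.findall(r'\[(.*?)\]')
# One row per group; the ID values are taken from the original row index
df = (pd.DataFrame(s.values.tolist(), index=s.index)
        .stack()
        .reset_index(level=1, drop=True)
        .reset_index())
df.columns = ['ID', 'Main']
print(df)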
   ID            Main
0   0   30 115 266 38
1   0  662 99 1199 43
2   0   511 133 25 47
3   0   664 162 49 22
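Note that the reshaping above labels the rows with the frame's index (index=s.index), which only matches ID when the two happen to be equal, as in the sample. A minimal sketch of one way to carry the real ID column instead is shown here. The second row (ID 7) is made-up data added purely for illustration, and the dropna() call is an extra precaution for rows with different numbers of groups, not part of the original answer:

```python
import pandas as pd

# Sample frame rebuilt by hand; the row with ID 7 is hypothetical, added for illustration.
df = pd.DataFrame({
    "ID": [0, 7],
    "Main": [
        "[30 115 266 38;662 99 1199 43] [511 133 25 47] [664 162 49 22]",
        "[1 2 3 4;5 6 7 8]",
    ],
})

# Same extraction as above: give each group its own brackets, then collect the groups.
s = (df["Main"].fillna("")
       .str.replace(";", "][", regex=False)
       .str.findall(r"\[(.*?)\]"))

# Use the ID values as the index so they survive the reshape.
out = (pd.DataFrame(s.tolist(), index=pd.Index(df["ID"], name="ID"))
         .stack()
         .dropna()                          # drop padding from shorter rows
         .reset_index(level=1, drop=True)   # discard the position-within-row level
         .reset_index(name="Main"))
print(out)
```

With this variant the final rename step is not needed, because the index already carries the name ID.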
Another way to build the Series of lists is to strip the outer brackets and split on ; or on the ] [ between groups:
s = df['Main'].fillna('').str.strip('[]').str.split(r';|\]\s+\[')
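The split-based Series can be turned into the expected long format in several ways. One possible sketch, assuming pandas 0.25 or later, uses DataFrame.explode. This is a swapped-in technique, not the stack-based reshape from the answer, and the sample frame is rebuilt by hand from the question:

```python
import pandas as pd

# Sample frame rebuilt by hand from the question.
df = pd.DataFrame({
    "ID": [0],
    "Main": ["[30 115 266 38;662 99 1199 43] [511 133 25 47] [664 162 49 22]"],
})

# Strip the outer brackets, then split on ';' or on the '] [' between groups.
# A multi-character pattern is treated as a regular expression by str.split.
s = df["Main"].fillna("").str.strip("[]").str.split(r";|\]\s+\[")

# Attach the lists to the ID column and emit one row per list element.
out = df[["ID"]].join(s).explode("Main").reset_index(drop=True)
print(out)
```

Because explode repeats the other columns for every list element, the ID column is carried along without relying on the row index.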