
Python: How can I extend a DataFrame with multiple fields that are calculated from a column?

Tags:

python

pandas

I have a DataFrame which looks like:

     A    B 
0  2.0  'C=4;D=5;'
1  2.0  'C=4;D=5;'
2  2.0  'C=4;D=5;'

I can parse the string in column B, let's say with a function named parse_col(), into a dict that looks like:

{'C': 4, 'D': 5}

How can I add the two extra columns to the DataFrame so that it looks like this:

     A    B          C   D
0  2.0  'C=4;D=5;'   4   5
1  2.0  'C=4;D=5;'   4   5
2  2.0  'C=4;D=5;'   4   5

I could take just that one column, parse it, and add the results back by hand, but that is clearly not the best way.
I also tried a variation of the example in the pandas apply documentation, but I couldn't get it to work on only one specific column.

Green asked Oct 31 '25 10:10


2 Answers

We can use Series.str.extractall and then chain it with unstack to pivot the rows to columns:

# Raw string keeps \d as a regex escape; the extracted values are strings
df[['C', 'D']] = df['B'].str.extractall(r'(\d+)').unstack()

     A           B  C  D
0  2.0  'C=4;D=5;'  4  5
1  2.0  'C=4;D=5;'  4  5
2  2.0  'C=4;D=5;'  4  5
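
To make the intermediate steps visible, here is a minimal self-contained sketch of the same extractall/unstack idea. The sample frame is rebuilt from the question; the names matches, wide and result are only illustrative, and the integer cast and the join are assumptions about what is wanted rather than part of the one-liner above.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({"A": [2.0, 2.0, 2.0], "B": ["C=4;D=5;"] * 3})

# One row per regex match, indexed by (original row, match number)
matches = df["B"].str.extractall(r"(\d+)")

# Pivot the match number into columns: one column per match position
wide = matches[0].unstack()

# Name the columns by hand (this approach does not read the names from B)
wide.columns = ["C", "D"]

# Extracted values are strings, so cast them before attaching to df
result = df.join(wide.astype(int))
print(result)
```

Note that the column names are hard-coded here, so this only fits when every row holds the same fields in the same order.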
Erfan answered Nov 04 '25 12:11


You can use df.eval with functools.reduce. This way the column names are read directly from the strings in B:

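>>> from functools import reduce
>>> reduce(
...     lambda x, y: x.eval(y),  # each y is an assignment string such as 'C=4'
...     df.B.str
...         .extractall(r'([A-Za-z]=\d+)')
...         .unstack().xs(0),    # assignments are taken from row 0 only
...     df,
... )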

     A           B  C  D
0  2.0  'C=4;D=5;'  4  5
1  2.0  'C=4;D=5;'  4  5
2  2.0  'C=4;D=5;'  4  5
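
For readers unfamiliar with reduce, the following sketch unrolls what the call above does, assuming the sample frame from the question. The names assignments, step1, step2 and result are illustrative only. Because xs(0) reads the assignment strings from the row labelled 0, the same values are applied to every row, which matches the question's data but would not suit rows that differ.

```python
from functools import reduce

import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({"A": [2.0, 2.0, 2.0], "B": ["C=4;D=5;"] * 3})

# Assignment strings found in the row labelled 0: 'C=4' and 'D=5'
assignments = df["B"].str.extractall(r"([A-Za-z]=\d+)").unstack().xs(0)

# Applying the assignments one at a time; each eval returns a new frame
step1 = df.eval("C=4")
step2 = step1.eval("D=5")

# reduce threads the frame through eval once per assignment string
result = reduce(lambda frame, expr: frame.eval(expr), assignments, df)
print(result)
```

Since eval returns a new frame each time, df itself is left unchanged; assign the result back if you want to keep the new columns.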
Sayandip Dutta answered Nov 04 '25 13:11