
Pandas data conversion

Tags:

python

pandas

I have the following data in a Pandas dataframe:

AIRPORT
EWR|JAX
EWR|BHX
EWR|BHX
EWR|BHX
EWR|BHX

... Is there a dynamic way to convert this to:

AIRPORT  EWR  JAX   BHX
EWR|JAX  Y    Y     NULL
EWR|BHX  Y    NULL  Y

and so on. I know how to do this by counting a hard-coded value for a single airport:

df.assign(EWR = lambda x: x.AIRPORT.apply(lambda y: y.split('|').count('EWR')))

but I'm hoping not to have to write this code for each airport.

Bardiya Choupani asked Feb 04 '23 10:02


1 Answer

You can use the .str accessor with get_dummies, which splits each string on '|' by default and returns one 0/1 indicator column per distinct value. Then use assign with dictionary unpacking to add those columns to your dataframe. Lastly, use replace to change the 0s and 1s to the values of your choice (str, bool, or NaN).

import numpy as np
# get_dummies splits on '|' by default; replace maps 1 -> 'Y' and 0 -> NaN
df_out = df.assign(**df.AIRPORT.str.get_dummies().replace({1:'Y',0:np.nan}))
print(df_out)

Output:

   AIRPORT  BHX EWR  JAX
0  EWR|JAX  NaN   Y    Y
1  EWR|BHX    Y   Y  NaN
2  EWR|BHX    Y   Y  NaN
3  EWR|BHX    Y   Y  NaN
4  EWR|BHX    Y   Y  NaN
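
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach, broken into steps. The sample frame is rebuilt from the data in the question, and the intermediate names `dummies` and `flags` are only illustrative; they are not part of the one-liner above.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed to be the full frame)
df = pd.DataFrame(
    {"AIRPORT": ["EWR|JAX", "EWR|BHX", "EWR|BHX", "EWR|BHX", "EWR|BHX"]}
)

# Step 1: one 0/1 indicator column per distinct airport code.
# sep="|" is the default for Series.str.get_dummies; it is spelled out here for clarity.
dummies = df["AIRPORT"].str.get_dummies(sep="|")

# Step 2: map 1 -> 'Y' and 0 -> NaN
flags = dummies.replace({1: "Y", 0: np.nan})

# Step 3: attach the indicator columns to the original frame
df_out = df.assign(**flags)
print(df_out)
```

Because the column names come from the data, no airport code has to be written out by hand. If the values used a different delimiter, passing it as `sep` would be the only change needed.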
Scott Boston answered Feb 13 '23 04:02