
How to subtract two columns of pyspark dataframe and also divide?

I have a dataframe like this:

dd1 : -

    A    B   
   2112  2637
   1293  2251
   1779  2435
   935   2473

I want to subtract col B from col A and divide the result by col A, like this:

    A    B       Result 
   2112  2637    -0.24
   1293  2251    -0.74
   1779  2435    -0.36
   935   2473   -1.64

Like (2112-2637)/2112 = -0.24

If this is not possible directly, we can first perform the subtraction and store it in a new column, then divide that column and store the result in another column.

vishwajeet Mane asked Sep 12 '25 20:09


1 Answer

The general idea, in pandas-style syntax, is as follows:

dd1['Result'] = ( dd1['A'] - dd1['B'] ) / dd1['A']

In PySpark, it would look something like:

dd1 = dd1.withColumn('Result', ( dd1['A'] - dd1['B'] ) / dd1['A'] )
oreopot answered Sep 16 '25 09:09


