I am trying to chain several "when" calls so that a column can take more than two possible values.
I tried the same logic as nested IF functions in Excel:
df.withColumn("device_id", when(col("device")=="desktop",1)).otherwise(when(col("device")=="mobile",2)).otherwise(null))
But that doesn't work since I can't put a tuple into the "otherwise" function.
Have you tried:
from pyspark.sql import functions as F
df.withColumn('device_id', F.when(F.col('device') == 'desktop', 1).when(F.col('device') == 'mobile', 2).otherwise(None))
Note that when chaining when functions, you do not need to wrap the successive calls in an otherwise function.