if else in pyspark for collapsing column values

I am trying to collapse a categorical variable in my dataframe into binary classes after indexing. Currently my column has 3 classes: "A", "B", "C". I am writing a simple if/else statement to collapse the classes, like this:

def condition(r):
    if (r.wo_flag=="SLM" or r.wo_flag=="NON-SLM"):
        r.wo_flag="dispatch"
    else:
        r.wo_flag="non_dispatch"
    return r.wo_flag

df_final=df_new.map(lambda x: condition(x))

It's not working; it doesn't seem to understand the else condition.

|MData|Recode12|Status|DayOfWeekOfDispatch|MannerOfDispatch|Wo_flag|PlaceOfInjury|Race|
|    M|      11|     M|                  4|               7|      C|           99|   1|
|    M|       8|     D|                  3|               7|      A|           99|   1|
|    F|      10|     W|                  2|               7|      C|           99|   1|
|    M|       9|     D|                  1|               7|      B|           99|   1|
|    M|       8|     D|                  2|               7|      C|           99|   1|

This is the sample data.

asked May 04 '16 20:05

Shweta Kamble

2 Answers

The accepted answer is not very efficient because it uses a user-defined function (UDF).

I think most people are looking for when, a built-in function in pyspark.sql.functions:

from pyspark.sql.functions import when

matches = df["wo_flag"].isin("SLM", "NON-SLM")
new_df = df.withColumn("wo_flag", when(matches, "dispatch").otherwise("non-dispatch"))

answered Oct 16 '22 16:10

mcskinner

Try this:

from pyspark.sql.types import StringType
from pyspark.sql.functions import udf

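def modify_values(r):
    # Collapse classes "A" and "B" into "dispatch"; everything else becomes "non-dispatch".
    if r == "A" or r == "B":
        return "dispatch"
    else:
        return "non-dispatch"

# Wrap the Python function as a UDF that returns a string column.
ol_val = udf(modify_values, StringType())
new_df = df.withColumn("wo_flag", ol_val(df.wo_flag))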

Things you are doing wrong:

You are trying to modify Rows, but Rows are immutable.
When a map operation is done on a dataframe, the resulting data structure is a PipelinedRDD and not a dataframe. You have to apply .toDF() to get a dataframe back.

answered Oct 16 '22 15:10

Himaprasoon

Tags: dataframe, conditional-statements, if-statement, pyspark
