I want to do something like this:
df.replace('empty-value', None, 'NAME')
Basically, I want to replace some value with NULL, but this function does not accept None. How can I do this?
To replace an empty value with None/null in a single DataFrame column, you can use withColumn() together with the when().otherwise() functions.
Replacing null values is one of the most common operations on PySpark DataFrames. It can be done with either the DataFrame.fillna() or the DataFrameNaFunctions.fill() method.
With the PySpark SQL function regexp_replace() you can replace one string/substring in a column value with another. regexp_replace() uses Java regex for matching; if the regex does not match, the value is returned unchanged. A typical use is replacing the street-name value Rd with Road in an address column.
You can combine a when clause with a NULL literal and type casting as follows:
from pyspark.sql.functions import when, lit, col

df = sc.parallelize([(1, "foo"), (2, "bar")]).toDF(["x", "y"])

def replace(column, value):
    return when(column != value, column).otherwise(lit(None))

df.withColumn("y", replace(col("y"), "bar")).show()
## +---+----+
## |  x|   y|
## +---+----+
## |  1| foo|
## |  2|null|
## +---+----+
It doesn't introduce a BatchPythonEvaluation step, and because of that it should be significantly more efficient than using a UDF.
This will replace empty-value with None in your name column:
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

df = sc.parallelize([(1, "empty-value"), (2, "something else")]).toDF(["key", "name"])

new_column_udf = udf(lambda name: None if name == "empty-value" else name, StringType())
new_df = df.withColumn("name", new_column_udf(df.name))
new_df.collect()
Output:
[Row(key=1, name=None), Row(key=2, name=u'something else')]
Because the old name is passed as the first argument to withColumn, the original name column is replaced by the new one generated from the UDF output.