
In PySpark, is it possible to fillna with another column?

Let's say there is a DataFrame that looks like this:

+----+--------------+-----+
| age|best_guess_age| name|
+----+--------------+-----+
|  23|            23|Alice|
|null|            18|  Bob|
|  34|            32|  Tom|
|null|            40|Linda|
+----+--------------+-----+

We want to fill the age column with the value from the best_guess_age column whenever age is null.

The fillna command requires an actual value to replace the nulls, so we can't simply pass in a column.

How can this be done?

asked Aug 21 '18 15:08 by foobar




1 Answer

You can use the coalesce function. coalesce('age', 'best_guess_age') takes the value from the age column when it is not null, and otherwise the value from the best_guess_age column:

from pyspark.sql.functions import coalesce

# Overwrite age with the first non-null value of age and best_guess_age
df.withColumn('age', coalesce('age', 'best_guess_age')).show()
+---+--------------+-----+
|age|best_guess_age| name|
+---+--------------+-----+
| 23|            23|Alice|
| 18|            18|  Bob|
| 34|            32|  Tom|
| 40|            40|Linda|
+---+--------------+-----+
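
To make the row-by-row behaviour concrete, here is a minimal plain-Python sketch of what coalesce does with each row's values. It is only a model of the semantics, not Spark code: the helper name coalesce_values and the list-of-dicts layout are made up for illustration, with None standing in for null.

```python
# Plain-Python model of coalesce semantics (illustrative only, not Spark code).
# `coalesce_values` is a hypothetical helper, not part of PySpark.
def coalesce_values(*values):
    """Return the first argument that is not None, or None if all are None."""
    for value in values:
        if value is not None:
            return value
    return None


# The sample data from the question, with None standing in for null.
rows = [
    {"age": 23, "best_guess_age": 23, "name": "Alice"},
    {"age": None, "best_guess_age": 18, "name": "Bob"},
    {"age": 34, "best_guess_age": 32, "name": "Tom"},
    {"age": None, "best_guess_age": 40, "name": "Linda"},
]

# Rebuild each row, replacing age with the first non-null candidate.
filled = [
    {**row, "age": coalesce_values(row["age"], row["best_guess_age"])}
    for row in rows
]

filled_ages = [row["age"] for row in filled]
print(filled_ages)  # [23, 18, 34, 40]
```

Note that Tom keeps 34 rather than 32, because the fallback is only consulted when age is null. If both values in a row are null, the result is still null, so a constant fillna afterwards may still be needed for such rows.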
answered Nov 15 '22 04:11 by Psidom