I have a DataFrame with a variable number of columns, on the order of thousands.
I want to convert all values to upper case.
Here is the approach I have thought of. Can you suggest whether this is the best way?
In pandas, apply() calls a function on every row of a DataFrame when you pass axis=1. In the example this refers to, the add() function is applied to every row.
In PySpark, the syntax for applying a function to a column has two parts: an import that is used to pass the user-defined function, and the DataFrame together with the user-defined function to be applied to the column. The column name is passed as the parameter, and the function is passed along with it.
The pandas DataFrame class provides a member function, apply(), that applies a function along an axis of the DataFrame, that is, along each row or each column. Its most important argument is func, the function to be applied to each column or row. This function accepts a Series and returns a Series.
Method 6: Using select() with collect(). This method selects a particular row from a PySpark DataFrame by combining select() with collect(). Here, dataframe is the PySpark DataFrame and columns is the list of columns to be displayed in each row.
I needed to do something similar, but had to write my own function to convert empty strings within a DataFrame to null. This is what I did.
import org.apache.spark.sql.functions.{col, udf}
import spark.implicits._

// Map null, empty, or whitespace-only strings to None (Spark stores None as null).
def emptyToNull(_str: String): Option[String] = {
  _str match {
    case s if s == null || s.trim.isEmpty => None
    case _ => Some(_str)
  }
}

val emptyToNullUdf = udf(emptyToNull(_: String))
val df = Seq(("a", "B", "c"), ("D", "e ", ""), ("", "", null)).toDF("x", "y", "z")
// Apply the UDF to every column, keeping the original column names.
df.select(df.columns.map(c => emptyToNullUdf(col(c)).alias(c)): _*).show
+----+----+----+
|   x|   y|   z|
+----+----+----+
|   a|   B|   c|
|   D|  e |null|
|null|null|null|
+----+----+----+
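For the original question (upper-casing every value), the same select-over-df.columns pattern should also work with Spark's built-in upper function in place of a UDF. The following is a minimal sketch, assuming every column is a string column and that a local SparkSession is acceptable for illustration. The names UpperAllColumns and upperAll and the sample data are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, upper}

object UpperAllColumns {
  // Hypothetical helper: build one upper(...) expression per column
  // and select them all in a single projection, keeping column names.
  // Assumption: all columns are string columns.
  def upperAll(df: DataFrame): DataFrame =
    df.select(df.columns.map(c => upper(col(c)).alias(c)): _*)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("upper-all-columns")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("a", "B", "c"), ("D", "e ", "")).toDF("x", "y", "z")
    upperAll(df).show()

    spark.stop()
  }
}
```

With thousands of columns, a single select like this is generally preferable to calling withColumn in a loop, because each withColumn call adds another projection to the query plan.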
Here's a more refined version of emptyToNull that wraps the input in Option instead of checking for null explicitly.
def emptyToNull(_str: String): Option[String] = Option(_str) match {
  case ret @ Some(s) if (s.trim.nonEmpty) => ret
  case _ => None
}
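Because emptyToNull is plain Scala, its behaviour can be checked without a Spark session. Here is a small sketch with the function repeated so the snippet is self-contained. The object name EmptyToNullCheck is hypothetical.

```scala
object EmptyToNullCheck {
  // Same definition as above: Option(...) turns null into None,
  // and the guard rejects empty or whitespace-only strings.
  def emptyToNull(_str: String): Option[String] = Option(_str) match {
    case ret @ Some(s) if s.trim.nonEmpty => ret
    case _ => None
  }

  def main(args: Array[String]): Unit = {
    assert(emptyToNull(null).isEmpty)
    assert(emptyToNull("").isEmpty)
    assert(emptyToNull("   ").isEmpty)
    // Surrounding whitespace is preserved; only all-blank input becomes None.
    assert(emptyToNull("e ") == Some("e "))
    println("all checks passed")
  }
}
```

Note that the function only tests the trimmed value; it returns the original string untrimmed, which is why "e " keeps its trailing space in the output table.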