I want to produce the following data using a Spark (2.2) Dataset:
Name Age Age+5
A 10 15
B 5 10
C 25 30
I tried the following:
dataset.select(
dataset.col("Name"),
dataset.col("Age"),
dataset.col( dataset.selectExpr("Age"+5).toString() )
);
This throws an exception saying the Age column was not found.
The select() method is useful when you simply need to select a subset of columns from a Spark DataFrame. selectExpr(), on the other hand, comes in handy when you need to select columns and apply some transformation to one or more of them at the same time.
You can select one or more columns of a Spark DataFrame by passing the column names to the select() function. Since a DataFrame is immutable, this creates a new DataFrame containing only the selected columns. The show() function displays the DataFrame's contents.
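As a rough sketch of the select() route for the question above: in Java, "Age"+5 is plain string concatenation and yields the string "Age5", so the arithmetic has to be expressed on a Column instead. The example below assumes a local SparkSession and a small in-memory dataset; the class name SelectExample and the helpers sample() and withAgePlusFive() are made up for illustration and are not part of the Spark API.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

import static org.apache.spark.sql.functions.col;

public class SelectExample {

    // Hypothetical helper: builds the three-row dataset from the question.
    static Dataset<Row> sample(SparkSession spark) {
        StructType schema = new StructType()
            .add("Name", DataTypes.StringType)
            .add("Age", DataTypes.IntegerType);
        List<Row> rows = Arrays.asList(
            RowFactory.create("A", 10),
            RowFactory.create("B", 5),
            RowFactory.create("C", 25));
        return spark.createDataFrame(rows, schema);
    }

    // Hypothetical helper: select() with a Column expression for Age + 5.
    static Dataset<Row> withAgePlusFive(Dataset<Row> dataset) {
        return dataset.select(
            col("Name"),
            col("Age"),
            col("Age").plus(5).alias("Age+5")); // arithmetic on the Column, not on the name string
    }

    public static void main(String[] args) {
        // Assumption: running locally with a single thread.
        SparkSession spark = SparkSession.builder()
            .master("local[1]")
            .appName("select-example")
            .getOrCreate();

        withAgePlusFive(sample(spark)).show();

        spark.stop();
    }
}
```

The original dataset is left unchanged; select() returns a new Dataset with the extra column.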
mode (default PERMISSIVE): sets the mode for dealing with corrupt records during parsing. PERMISSIVE: sets other fields to null when it meets a corrupted record, and puts the malformed string into a new field configured by columnNameOfCorruptRecord.
selectExpr has the definition:
public Dataset<Row> selectExpr(String... exprs)
It takes a varargs String as its parameter, and each string is parsed as a SQL expression. So, you can just use:
dataset.selectExpr( "Name", "Age", "Age+5" )
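A slightly fuller sketch of the same idea, in case the derived column should be named exactly Age+5: without an alias, Spark generates a column name from the expression itself (something like (Age + 5)), so the example adds an AS clause with a backtick-quoted name. The class name SelectExprExample, the helper names, and the local SparkSession setup are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class SelectExprExample {

    // Hypothetical helper: builds the three-row dataset from the question.
    static Dataset<Row> sample(SparkSession spark) {
        StructType schema = new StructType()
            .add("Name", DataTypes.StringType)
            .add("Age", DataTypes.IntegerType);
        List<Row> rows = Arrays.asList(
            RowFactory.create("A", 10),
            RowFactory.create("B", 5),
            RowFactory.create("C", 25));
        return spark.createDataFrame(rows, schema);
    }

    // Hypothetical helper: each argument is parsed as a SQL expression.
    // Backticks quote the alias because it contains a '+' character.
    static Dataset<Row> agePlusFive(Dataset<Row> dataset) {
        return dataset.selectExpr("Name", "Age", "Age + 5 AS `Age+5`");
    }

    public static void main(String[] args) {
        // Assumption: running locally with a single thread.
        SparkSession spark = SparkSession.builder()
            .master("local[1]")
            .appName("selectExpr-example")
            .getOrCreate();

        agePlusFive(sample(spark)).show();

        spark.stop();
    }
}
```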