I recently started experimenting with both Spark and Java. I initially went through the famous WordCount example using RDDs, and everything went as expected. Now I am trying to implement my own example, but using DataFrames instead of RDDs.
So I am reading a dataset from a file with
DataFrame df = sqlContext.read()
.format("com.databricks.spark.csv")
.option("inferSchema", "true")
.option("delimiter", ";")
.option("header", "true")
.load(inputFilePath);
and then I try to select a specific column and apply a simple transformation to every row, like this:
df = df.select("start")
.map(text -> text + "asd");
But compilation fails on the second line with an error I don't fully understand (the start column is inferred as type string):
Multiple non-overriding abstract methods found in interface scala.Function1
Why is my lambda function treated as a Scala function and what does the error message actually mean?
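The short version: a Java lambda can only implement a functional interface, i.e. an interface with exactly one abstract method. In Scala versions before 2.12, scala.Function1 compiles to a Java interface that exposes several abstract methods (apply plus andThen, compose, and specialized variants), so the compiler cannot pick one for the lambda to implement. A minimal plain-Java sketch of the same situation (all names here are made up for illustration, not from Spark or Scala):

```java
// A lambda can target this interface: exactly one abstract method.
interface OneMethod {
    String apply(String s);
}

// A lambda cannot target this interface: the compiler cannot tell
// which abstract method the lambda body is meant to implement.
// This is roughly how scala.Function1 looks to javac in pre-2.12 Scala.
interface TwoMethods {
    String apply(String s);
    String applyTwice(String s);
}

public class LambdaTarget {
    public static void main(String[] args) {
        OneMethod ok = s -> s + "asd";        // compiles fine
        // TwoMethods bad = s -> s + "asd";   // would not compile:
        // "multiple non-overriding abstract methods found"
        System.out.println(ok.apply("start"));
    }
}
```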
If you use the select
function on a DataFrame you get a DataFrame back, so your function is applied to the Row
datatype, not to the value inside the row. You therefore have to get the value out of the Row first. In Java you also have to go through the javaRDD() bridge, because JavaRDD.map takes a Java functional interface instead of scala.Function1 (which a Java lambda cannot implement):
df.select("start").javaRDD().map(row -> row.getString(0) + "asd")
But you will get a JavaRDD<String> as the return value, not a DataFrame.
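To see what the fixed line is actually doing, here is a plain-Java analogue that needs no Spark on the classpath. The Row class below is a hypothetical stand-in for org.apache.spark.sql.Row, and the stream pipeline mimics the select-then-map step:

```java
import java.util.List;
import java.util.stream.Collectors;

public class RowMapSketch {
    // Minimal stand-in for org.apache.spark.sql.Row: a row of column values.
    static class Row {
        private final Object[] values;
        Row(Object... values) { this.values = values; }
        String getString(int i) { return (String) values[i]; }
    }

    public static void main(String[] args) {
        // df.select("start") conceptually yields rows, not bare strings...
        List<Row> selected = List.of(new Row("a"), new Row("b"));

        // ...so the lambda must first pull the value out of each Row,
        // just like row.getString(0) in the Spark version.
        List<String> mapped = selected.stream()
                .map(row -> row.getString(0) + "asd")
                .collect(Collectors.toList());

        System.out.println(mapped); // prints [aasd, basd]
    }
}
```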