Trying to use map on a Spark DataFrame

I recently started experimenting with both Spark and Java. I first went through the famous WordCount example using an RDD and everything went as expected. Now I am trying to implement my own example, but using DataFrames instead of RDDs.

So I am reading a dataset from a file with:

// Spark 1.x: read a semicolon-delimited CSV with a header via the spark-csv package
DataFrame df = sqlContext.read()
        .format("com.databricks.spark.csv")
        .option("inferSchema", "true")
        .option("delimiter", ";")
        .option("header", "true")
        .load(inputFilePath);

and then I try to select a specific column and apply a simple transformation to every row, like this:

df = df.select("start")
        .map(text -> text + "asd");

But compilation fails on the second line with an error I don't fully understand (the start column is inferred to be of type string):

Multiple non-overriding abstract methods found in interface scala.Function1

Why is my lambda function treated as a Scala function and what does the error message actually mean?

asked Mar 02 '17 by LetsPlayYahtzee

People also ask

Can we use map on DataFrame in Spark?

Spark's map() is a transformation operation that applies a function to every element of an RDD, DataFrame, or Dataset and returns a new RDD or Dataset with the results.
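
As a rough sketch in the Spark 2.x Java API (where a DataFrame is a Dataset<Row>; df and the start column are taken from the question above), this could look as follows. The explicit MapFunction cast selects the Java-friendly overload of map(), and the Encoder tells Spark how to serialize the resulting strings:

import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;

// Append a suffix to every value of the "start" column.
Dataset<String> result = df.select("start")
        .map((MapFunction<Row, String>) row -> row.getString(0) + "asd",
                Encoders.STRING());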

How does map function work in Spark?

Spark's map function takes one element as input, processes it according to custom code (specified by the developer), and returns one element at a time. map transforms an RDD of length N into another RDD of length N; the input and output RDDs have the same number of records.
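
A minimal sketch of that one-in, one-out behaviour with the Java RDD API (assuming sc is an existing JavaSparkContext):

import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;

// Three input records produce exactly three output records.
JavaRDD<String> words = sc.parallelize(Arrays.asList("a", "bb", "ccc"));
JavaRDD<Integer> lengths = words.map(String::length); // [1, 2, 3]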

How do I create a map in Spark?

Using Spark DataTypes: we can create a map column using the createMapType() function on the DataTypes class. This method takes two arguments, keyType and valueType, and both must be types that extend DataType.
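
For example, a small sketch (the field names name and scores are illustrative only):

import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.MapType;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// A map column whose keys are strings and whose values are integers.
MapType scoresType = DataTypes.createMapType(DataTypes.StringType, DataTypes.IntegerType);

// The map type can then be used inside a schema definition.
StructType schema = DataTypes.createStructType(new StructField[] {
        DataTypes.createStructField("name", DataTypes.StringType, false),
        DataTypes.createStructField("scores", scoresType, true)
});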


1 Answer

If you use the select function on a DataFrame you get a DataFrame back, and your function is then applied to the Row datatype, not to the value inside the row, so you need to extract the value first. On top of that, in the Java API you have to go through javaRDD(), because DataFrame.map expects a scala.Function1, which a Java lambda cannot implement (that is what the "multiple non-overriding abstract methods" error is telling you):

df.select("start").map(el->el.getString(0)+"asd")

Note, though, that you will get an RDD back as the return value, not a DataFrame.
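
For completeness, a rough end-to-end sketch under the same assumptions as the question (Spark 1.x Java API with the spark-csv package and a string column named start):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.DataFrame;

// Read the CSV exactly as in the question.
DataFrame df = sqlContext.read()
        .format("com.databricks.spark.csv")
        .option("inferSchema", "true")
        .option("delimiter", ";")
        .option("header", "true")
        .load(inputFilePath);

// select() keeps the Row wrapper; javaRDD() switches to the Java-friendly
// RDD API, whose map() accepts plain Java lambdas.
JavaRDD<String> result = df.select("start")
        .javaRDD()
        .map(row -> row.getString(0) + "asd");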

answered Sep 21 '22 by jojo_Berlin