
Spark Session returned an error : Apache NiFi

We are trying to run a Spark program using NiFi. This is the basic sample we tried to follow.

We have configured an Apache Livy server at 127.0.0.1:8998.

The ExecuteSparkInteractive processor is used to run the sample Spark code:

val gdpDF = spark.read.json("gdp.json")
val gdpRDD = gdpDF.rdd
gdpRDD.count()

The LivyController is configured for 127.0.0.1, port 8998, with Session Type: spark.

When we run the processor, we get the following error:

Spark Session returned an error, sending the output JSON object as the flow file content to failure (after penalizing)

We just want to output the line count of the JSON file. How can we redirect it to the flow file?

NiFi user log:

2020-04-13 21:50:49,955 INFO [NiFi Web Server-85] org.apache.nifi.web.filter.RequestLogger Attempting request for (anonymous) GET http://localhost:9090/nifi-api/flow/controller/bulletins (source ip: 127.0.0.1)

NiFi app.log

ERROR [Timer-Driven Process Thread-3] o.a.n.p.livy.ExecuteSparkInteractive ExecuteSparkInteractive[id=9a338053-0173-1000-fbe9-e613558ad33b] Spark Session returned an error, sending the output JSON object as the flow file content to failure (after penalizing)

Sachith Muhandiram asked Apr 03 '20 11:04


1 Answer

I have seen several people struggle with this example. I recommend following this example from the Cloudera Community instead (note part 2 in particular): https://community.cloudera.com/t5/Community-Articles/HDF-3-1-Executing-Apache-Spark-via-ExecuteSparkInteractive/ta-p/247772

The key points I would be concerned with:

  1. Does your Spark work in general?
  2. Does your Livy work in general?
  3. Is the Spark sample code good?
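
For the second point, one way to check Livy independently of NiFi is to call its REST API directly. The following is only a minimal sketch in Scala, assuming JDK 11 or later and the Livy address given in the question; the object name LivyCheck and the helper sessionsUri are hypothetical names chosen for illustration.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object LivyCheck {
  // Livy address as given in the question; change it if your server differs.
  val livyBase = "http://127.0.0.1:8998"

  // Builds the URI of Livy's session-listing endpoint (GET /sessions).
  def sessionsUri(base: String): URI =
    URI.create(base.stripSuffix("/") + "/sessions")

  def main(args: Array[String]): Unit = {
    val client = HttpClient.newHttpClient()
    val request = HttpRequest.newBuilder(sessionsUri(livyBase)).GET().build()
    // send() throws (for example java.net.ConnectException) if nothing
    // is listening on that host and port.
    val response = client.send(request, HttpResponse.BodyHandlers.ofString())
    println(s"HTTP ${response.statusCode()}")
    println(response.body()) // JSON describing the current Livy sessions
  }
}
```

If this prints a 200 status and a JSON body, Livy is at least reachable, and the remaining suspects are Spark itself and the sample code (for example, whether gdp.json can actually be found from where the Spark session runs).
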
Dennis Jaheruddin answered Nov 13 '22 23:11