Runtime: Spark 2.3.0, Scala 2.11 (Databricks 4.1 ML beta)
import org.apache.spark.sql.streaming.Trigger
import scala.concurrent.duration._
// Kafka settings and df definition go here
val query = df.writeStream.format("parquet")
.option("path", ...)
.option("checkpointLocation",...)
.trigger(continuous(30000))
.outputMode(OutputMode.Append)
.start
Throws the error: not found: value continuous
Other attempts that did not work:
.trigger(continuous = "30 seconds") // as per the Databricks blog
// throws the same error as above
.trigger(Trigger.Continuous("1 second")) // as per the Spark docs
// throws java.lang.IllegalStateException: Unknown type of trigger: ContinuousTrigger(1000)
References:
(Databricks Blog) https://databricks.com/blog/2018/03/20/low-latency-continuous-processing-mode-in-structured-streaming-in-apache-spark-2-3-0.html
(Spark guide) http://spark.apache.org/docs/2.3.0/structured-streaming-programming-guide.html#continuous-processing
(Scaladoc) https://spark.apache.org/docs/2.3.0/api/scala/index.html#org.apache.spark.sql.streaming.package
In streaming systems, a special event is needed to kick off processing; that event is called a trigger. Spark Structured Streaming offers a few trigger types. Default: a new micro-batch executes as soon as the previous one finishes. Fixed-interval micro-batches: micro-batches execute at the interval you specify.
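A minimal sketch contrasting the two, assuming df is the streaming DataFrame defined earlier and writing to the console sink only for illustration:

import org.apache.spark.sql.streaming.Trigger

// Default: no trigger specified; each micro-batch starts as soon as the previous one finishes
val defaultQuery = df.writeStream
  .format("console")
  .start()

// Fixed-interval micro-batches: kick off a micro-batch every 30 seconds
val fixedIntervalQuery = df.writeStream
  .format("console")
  .trigger(Trigger.ProcessingTime("30 seconds"))
  .start()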
Apache Spark also provides the Trigger.Once() option (trigger(once=True) in PySpark) to process all new data from the source as a single micro-batch. This trigger-once pattern ignores all settings that control streaming input size, which can lead to massive spill or out-of-memory errors.
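A sketch of the trigger-once pattern in Scala; the output path and checkpoint location below are placeholders, not values from the question:

import org.apache.spark.sql.streaming.Trigger

// Process all data currently available in the source as one micro-batch, then stop.
// Rate-limit settings such as maxFilesPerTrigger are ignored in this mode.
val onceQuery = df.writeStream
  .format("parquet")
  .option("path", "/tmp/once-output")           // placeholder path
  .option("checkpointLocation", "/tmp/once-cp") // placeholder checkpoint
  .trigger(Trigger.Once())
  .outputMode("append")
  .start()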
Apache Spark Structured Streaming is a near-real-time processing engine that offers end-to-end fault tolerance with exactly-once processing guarantees using familiar Spark APIs. Structured Streaming lets you express computation on streaming data in the same way you express a batch computation on static data.
Internally, a DStream is a sequence of RDDs: Spark receives real-time data and divides it into smaller batches for the execution engine. In contrast, Structured Streaming is built on the Spark SQL API for data stream processing.
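To illustrate that batch/streaming parity, a small sketch; the paths and column name here are hypothetical:

// Batch: a static DataFrame read
val staticDf  = spark.read.format("json").load("/data/events")
val staticAgg = staticDf.groupBy("userId").count()

// Streaming: the same transformation expressed on a streaming DataFrame
// (file-based streaming sources require an explicit schema)
val streamDf  = spark.readStream.format("json").schema(staticDf.schema).load("/data/events")
val streamAgg = streamDf.groupBy("userId").count()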
Spark 2.3.0 does not support a Parquet sink in continuous processing mode; you would have to use a Kafka, console, or memory sink instead, which is why the Trigger.Continuous attempt above fails with the IllegalStateException.
To quote the Databricks blog post on continuous processing mode in Structured Streaming (linked above):
You can set the optional Continuous Trigger in queries that satisfy the following conditions: Read from supported sources like Kafka and write to supported sinks like Kafka, memory, console.
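As a sketch of a query that does satisfy those conditions in Spark 2.3.0, using the built-in rate source and the console sink (the checkpoint path is a placeholder):

import org.apache.spark.sql.streaming.Trigger

val continuousQuery = spark.readStream
  .format("rate")          // test source supported by continuous processing
  .load()
  .writeStream
  .format("console")       // console is a supported sink; parquet is not
  .option("checkpointLocation", "/tmp/continuous-cp")  // placeholder path
  .trigger(Trigger.Continuous("30 seconds"))           // checkpoint interval of 30 seconds
  .outputMode("append")
  .start()

Note that in Spark 2.3.0, continuous processing only supports map-like operations (projections and selections such as select, map, where, filter), not aggregations.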