
Structured Streaming exception: Append output mode not supported for streaming aggregations

I get the following error when I run my Spark job:

org.apache.spark.sql.AnalysisException: Append output mode not supported when there are streaming aggregations on streaming DataFrames/DataSets;;

I am not sure whether the issue is caused by the lack of a watermark, which I don't know how to apply in this context. This is the aggregation being applied:

def aggregateByValue(): DataFrame = {
  df.withColumn("Value", expr("(BookingClass, Value)"))
    .groupBy("AirlineCode", "Origin", "Destination", "PoS", "TravelDate", "StartSaleDate", "EndSaleDate", "avsFlag")
    .agg(collect_list("Value").as("ValueSeq"))
    .drop("Value")
}

Usage:

val theGroupedDF = theDF
  .multiplyYieldByHundred
  .explodeDates
  .aggregateByValue

val query = theGroupedDF.writeStream
  .outputMode("append")
  .format("console")
  .start()
query.awaitTermination()
ChiralCarbon asked Jan 29 '23 16:01



1 Answer

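Changing the output mode from append to complete solved the issue. In complete mode the entire result table is written to the sink on every trigger, so no watermark is required. Append mode only supports streaming aggregations that are defined on an event-time column with a watermark.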

val query = theGroupedDF.writeStream
  .outputMode("complete")
  .format("console")
  .start()
query.awaitTermination()
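
If append mode is needed (complete mode keeps and re-emits all aggregation state), the usual alternative is a watermark. The following is only a sketch under assumptions: the question does not show an event-time column, so `eventTime` (a timestamp column), the 10-minute lateness threshold, and the 5-minute window are hypothetical placeholders, as is the name `aggregateByValueWithWatermark`.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, collect_list, expr, window}

object WatermarkedAggregation {
  // Assumption: the input has a TimestampType column named "eventTime".
  // The DataFrame in the question does not show one, so this name, the
  // "10 minutes" delay and the "5 minutes" window are illustrative only.
  def aggregateByValueWithWatermark(df: DataFrame): DataFrame =
    df
      // Tell Spark how late data may arrive; state for older windows is dropped.
      .withWatermark("eventTime", "10 minutes")
      .withColumn("Value", expr("(BookingClass, Value)"))
      .groupBy(
        // Append mode needs the watermarked event-time column (here via a
        // time window) to be part of the grouping key.
        window(col("eventTime"), "5 minutes"),
        col("AirlineCode"), col("Origin"), col("Destination"), col("PoS"),
        col("TravelDate"), col("StartSaleDate"), col("EndSaleDate"), col("avsFlag"))
      .agg(collect_list("Value").as("ValueSeq"))
}
```

The result can then be written with `.outputMode("append")` as in the question. In that mode each window's row is emitted once, after the watermark has passed the end of the window, so output is delayed rather than updated on every trigger.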
ChiralCarbon answered Feb 16 '23 01:02
