New posts in apache-spark

Why am I getting an exception when using a Range Join hint?

could not find function "switch_lang"

Who executes the Python code in PySpark?

apache-spark pyspark

How to use Delta with Spark 3.0 Preview?

apache-spark delta-lake

Spark SQL alternatives to groupby/pivot/agg/collect_list using foldLeft & withColumn so as to improve performance

Last Access Time Update in Hive metastore

Read from MongoDB in Scala

mongodb scala apache-spark sbt

Hive table on Delta Lake

Dataproc does not unpack files passed as Archive

How to process logs from a distributed log broker (e.g. Kafka) exactly after 1 week?

spark-nlp: DocumentAssembler initialization failing with 'java.lang.NoClassDefFoundError: org/apache/spark/ml/util/MLWritable$class'

Why is Pandas UDF not being parallelized?

Get the difference between two versions of a Delta Lake table

Spark Structured Streaming program that reads from non-empty Kafka topic (starting from earliest) triggers batches locally, but not on EMR cluster

saveAsTextFile to S3 on Spark does not work, just hangs

amazon-s3 apache-spark

parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file

java hadoop apache-spark hive

Why does format("kafka") fail with "Failed to find data source: kafka." (even with uber-jar)?

ERROR SparkContext: Error initializing SparkContext. java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed [duplicate]

scala apache-spark

DataFrame error: "overloaded method value filter with alternatives"

ERROR Utils: Uncaught exception in thread SparkListenerBus

scala apache-spark