apache-spark tutorials and guides

Where is Spark Streaming's state stored?

Apr 25, 2026

apache-spark spark-streaming

Local Kafka Application failing with: NoSuchMethodError: createEphemeral

Apr 26, 2026

apache-spark apache-kafka producer-consumer apache-zookeeper

How to count the number of occurrences of a key in a PySpark DataFrame (2.1.0)

Apr 25, 2026

python apache-spark pyspark apache-spark-2.0

Dynamically select multiple columns while joining different DataFrames in Scala Spark

Apr 25, 2026

scala apache-spark dataframe apache-spark-sql

NoSuchMethodError while running Spark Streaming job on HDP 2.2

Apr 25, 2026

scala apache-spark hortonworks-data-platform spark-streaming

Why is Spark's sort slower than Scala's native sort method?

Apr 24, 2026

scala sorting apache-spark

Spark structured streaming of Kafka protobuf

Apr 25, 2026

scala apache-spark protocol-buffers spark-streaming scalapb

Apache Spark write to MySQL with JDBC connector (Write Mode: Ignore) is not performing as expected [duplicate]

Apr 24, 2026

mysql apache-spark jdbc pyspark apache-spark-sql

How to pass Dataset(s) to a function that accepts DataFrame(s) as arguments in Apache Spark using Scala?

Apr 25, 2026

scala apache-spark apache-spark-sql apache-spark-dataset

How to implement a custom PySpark explode (for an array of structs), 4 columns in 1 explode?

Apr 23, 2026

python-3.x apache-spark pyspark apache-spark-sql

Add batch number to DataFrame based on moving sum in spark

Apr 23, 2026

python dataframe apache-spark pyspark

Spark Streaming DirectKafkaInputDStream: Kafka data source can easily stress the driver node

Apr 24, 2026

apache-spark apache-kafka spark-streaming

Dynamic partition pruning not clear

Apr 24, 2026

apache-spark apache-spark-sql

Does Spark Streaming support Kafka 1.1.0 now?

Apr 24, 2026

apache-spark

hbase-spark for Spark 2

Apr 23, 2026

scala apache-spark hbase

Apache Spark Java heap space error during matrix multiplication

Apr 24, 2026

java apache-spark

Spark: treeAggregate at IDF is taking ages

Apr 24, 2026

apache-spark

New posts in apache-spark