apache-spark tutorials and guides

Why does a Spark executor receive SIGTERM?

Mar 23, 2022

apache-spark signals

Spark ML - MulticlassClassificationEvaluator - can we get precision/recall for each class label?

Nov 11, 2022

apache-spark machine-learning apache-spark-ml multiclass-classification

Is proper event-time sessionization possible with Spark Structured Streaming?

Mar 21, 2022

apache-spark apache-spark-sql spark-structured-streaming

Python Spark Dataframes: Better way to export groups to text file

Nov 09, 2018

python apache-spark dataframe

Proper save/load of MatrixFactorizationModel

Jan 04, 2022

apache-spark apache-spark-mllib

How does Spark send closures to workers?

Oct 24, 2022

apache-spark

Pyspark: applying kmeans on different groups of a dataframe

Feb 11, 2022

apache-spark group-by pyspark k-means

Structured streaming - Metrics in Grafana

Oct 14, 2022

apache-spark apache-spark-sql graphite spark-structured-streaming

Spark accumulator not displayed in Spark web UI

Aug 17, 2022

apache-spark

How to redirect Scala Spark Dataset.show to a log4j logger

May 06, 2021

scala logging apache-spark dataset

Applying Python function to Pandas grouped DataFrame - what's the most efficient approach to speed up the computations?

Feb 27, 2022

python pandas apache-spark parallel-processing dask

Using SparkR JVM to call methods from a Scala jar file

Jan 22, 2021

r scala apache-spark apache-spark-sql sparkr

Sorting JavaPairRDD first by value and then by key

Apr 08, 2019

java hadoop apache-spark

How to protect password and username in Spark (such as for JDBC connections/accessing RDBMS databases)?

Nov 15, 2022

apache-spark apache-spark-sql

How do I get a standalone Zeppelin service to see Hive?

Sep 11, 2022

apache-spark hive hortonworks-data-platform apache-zeppelin

Spark nodes keep printing GC (Allocation Failure) and no tasks run

Nov 18, 2022

scala apache-spark hadoop livy

Apache Spark 2.0: java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate

Mar 14, 2022

scala apache-spark apache-spark-sql apache-spark-dataset apache-spark-encoders

Unable to create array literal in spark/pyspark

Aug 22, 2022

apache-spark pyspark

How to know which stage of a job is currently running in Apache Spark?

Nov 12, 2022

java scala apache-spark bigdata

Using Spark Structured Streaming with Trigger.Once

Feb 02, 2022

scala apache-spark spark-structured-streaming
