apache-spark tutorials and guides

Filtering rows with empty arrays in PySpark

Nov 14, 2022

Reading S3 in Spark with sc.textFile("s3a://bucket/filePath") fails with java.lang.NoSuchMethodError: com.amazonaws.services.s3.transfer.TransferManager

Jan 14, 2021

apache-spark amazon-s3

DataFrame column names conflict with . (dot)

Oct 12, 2022

scala apache-spark apache-spark-sql

How to make it easier to deploy my Jar to Spark Cluster in standalone mode?

Nov 05, 2022

jar apache-spark

Spark: How to use mapPartitions and create/close a connection per partition

Oct 28, 2022

scala apache-spark rdd

Why does conf.set("spark.app.name", appName) not set the name in the UI?

Apr 15, 2022

apache-spark

spark - scala: not a member of org.apache.spark.sql.Row

Apr 28, 2022

scala apache-spark apache-spark-sql rdd spark-dataframe

Calculating percentages on a PySpark DataFrame

Nov 11, 2022

apache-spark pyspark spark-dataframe

SparkSQL and explode on DataFrame in Java

Nov 07, 2022

java apache-spark apache-spark-sql

PySpark DataFrame: how to drop rows with nulls in all columns?

Sep 14, 2022

python apache-spark pyspark apache-spark-sql pyspark-sql

Spark Select with a List of Columns Scala

Aug 28, 2022

scala apache-spark

How to overwrite Spark ML model in PySpark?

Aug 30, 2022

apache-spark machine-learning pyspark apache-spark-mllib apache-spark-ml

Pyspark AWS credentials

Sep 19, 2022

amazon-web-services apache-spark amazon-s3 pyspark

How to get nth row of Spark RDD?

Nov 11, 2022

hadoop apache-spark rdd

Removing punctuation marks from text in Scala - Spark

Oct 26, 2022

regex scala apache-spark punctuation

Add a new column to a DataFrame, where the new column is a generated UUID

Sep 25, 2022

apache-spark apache-spark-sql uuid

The SPARK_HOME env variable is set but Jupyter Notebook doesn't see it. (Windows)

Apr 24, 2022

python-3.x apache-spark pyspark

How to improve broadcast Join speed with between condition in Spark

Apr 12, 2022

apache-spark apache-spark-sql

How to use lag and rangeBetween functions on timestamp values?

Sep 16, 2022

apache-spark pyspark apache-spark-sql window-functions

Spark: Joining with array

Nov 08, 2022

scala apache-spark apache-spark-sql
