apache-spark tutorials and guides

How long does RDD remain in memory?

Apr 03, 2022

apache-spark rdd

PySpark ML - How to save a pipeline and RandomForestClassificationModel

Oct 23, 2022

apache-spark pyspark apache-spark-mllib

Efficient string suffix detection

Jul 16, 2022

python apache-spark pyspark apache-spark-sql string-matching

Spark / Scala: Passing RDD to Function

Feb 26, 2022

scala apache-spark rdd

Why do I have to explicitly tell Spark what to cache?

Oct 04, 2022

apache-spark caching

How to apply a function to a column of a Spark DataFrame?

Oct 26, 2022

scala apache-spark dataframe apache-spark-sql

How do I convert a column of Unix epoch values to Date in an Apache Spark DataFrame using Java?

Nov 05, 2022

java apache-spark spark-dataframe

Query in Spark SQL inside an array

Dec 05, 2018

apache-spark apache-spark-sql spark-dataframe

Spark list all cached RDD names and unpersist

Feb 23, 2022

java scala dataframe apache-spark rdd

Request insufficient authentication scopes when running Spark-Job on dataproc

Oct 16, 2022

apache-spark google-cloud-platform google-cloud-dataproc

Unresolved reference while trying to import col from pyspark.sql.functions in Python 3.5

Apr 06, 2022

python apache-spark pyspark pyspark-sql spark-structured-streaming

IllegalArgumentException thrown when calling count and collect functions in Spark

Jul 10, 2022

python macos apache-spark pyspark python-3.6

Could not read data from JSON using PySpark

Nov 16, 2022

apache-spark pyspark

How to add days (as values of a column) to date?

Mar 18, 2022

scala apache-spark apache-spark-sql

No module named graphframes Jupyter Notebook

Oct 28, 2022

python apache-spark graphframes

How to change number of executors in local mode?

Sep 05, 2022

scala apache-spark spark-streaming

partitionBy & overwrite strategy in an Azure DataLake using PySpark in Databricks

Apr 22, 2022

python azure apache-spark apache-spark-sql databricks

How can I pass a list of columns to select in a PySpark DataFrame?

Apr 11, 2022

python apache-spark pyspark

String to Date migration from Spark 2.0 to 3.0 gives Fail to recognize 'EEE MMM dd HH:mm:ss zzz yyyy' pattern in the DateTimeFormatter

May 19, 2022

apache-spark pyspark apache-spark-sql

Apache Spark - Connection refused for worker

Aug 28, 2022

akka apache-spark
