apache-spark tutorials and guides

Modify collection inside a Spark RDD foreach

Sep 12, 2022

scala apache-spark rdd

PySpark — UnicodeEncodeError: 'ascii' codec can't encode character

Sep 15, 2022

python python-2.7 apache-spark pyspark

Replace missing values with mean - Spark Dataframe

Sep 15, 2022

scala apache-spark dataframe apache-spark-sql imputation

Spark-Submit: --packages vs --jars

Sep 22, 2022

java scala apache-spark cassandra

How do you perform basic joins of two RDD tables in Spark using Python?

Aug 29, 2022

python join apache-spark pyspark rdd

Spark RDD default number of partitions

Oct 19, 2022

scala apache-spark

How can I get the current SparkSession in any place of the codes?

Aug 01, 2022

scala apache-spark

Not able to import Spark Implicits in ScalaTest

Sep 15, 2022

scala apache-spark apache-spark-sql implicit scalatest

How to read only n rows of large CSV file on HDFS using spark-csv package?

Sep 15, 2022

apache-spark pyspark hdfs apache-spark-sql spark-csv

How to convert column of arrays of strings to strings?

Sep 15, 2022

apache-spark apache-spark-sql

setting SparkContext for pyspark

Sep 19, 2022

python apache-spark pyspark

pyspark dataframe add a column if it doesn't exist

Sep 14, 2022

apache-spark pyspark apache-spark-sql pyspark-sql

Why is the error "Unable to find encoder for type stored in a Dataset" when encoding JSON using case classes?

Nov 28, 2021

scala apache-spark apache-spark-dataset apache-spark-encoders

How to check if list contains all the same values?

Oct 26, 2022

scala list apache-spark

Show partitions on a pyspark RDD

Sep 14, 2022

python apache-spark pyspark

How to resolve external packages with spark-shell when behind a corporate proxy?

Sep 27, 2022

apache-spark proxy dependencies ivy

How to create hive table from Spark data frame, using its schema?

Sep 14, 2022

scala apache-spark hive

How to get the number of elements in partition? [duplicate]

Sep 14, 2022

apache-spark partitioning

Stratified sampling with pyspark

Sep 14, 2022

apache-spark pyspark apache-spark-sql

How to augment matrix factors in Spark ALS recommender? [duplicate]

Sep 14, 2022

python machine-learning apache-spark
