apache-spark tutorials and guides

What's the most efficient way to accumulate dataframes in pyspark?

Oct 21, 2022

How to use dataframes within a map function in Spark?

Oct 20, 2022

python apache-spark pyspark

Spark Model to use in Java Application

Oct 21, 2022

java apache-spark apache-spark-mllib

Cassandra + Spark for Real time analytics

Oct 20, 2022

apache-spark cassandra spark-streaming spark-dataframe

Fail to apply mapping on an RDD on multiple spark nodes through Elasticsearch-hadoop library

Oct 20, 2022

scala elasticsearch apache-spark rdd elasticsearch-hadoop

No Java class corresponding to Product with Serializable with Base found

Oct 20, 2022

java scala apache-spark rdd apache-spark-dataset

JavaDStream print RDDs in lambda to console

Oct 20, 2022

java spring apache-spark

Spark Streaming: long queued/active batches

Oct 20, 2022

apache-spark batch-processing spark-streaming

Spark app unable to write to elasticsearch cluster running in docker

Oct 21, 2022

elasticsearch apache-spark docker containers docker-compose

Jackson version is too old

Oct 20, 2022

scala apache-spark sbt

Updating data in database in Spark using Scala

Oct 20, 2022

scala apache-spark

How to tune "spark.rpc.askTimeout"?

Oct 20, 2022

apache-spark spark-streaming

How to Adjust Classification Threshold with a Spark Decision Tree

Oct 21, 2022

apache-spark apache-spark-mllib decision-tree

Why does spark-submit in YARN cluster mode not find python packages on executors?

Oct 20, 2022

python apache-spark pyspark

Specify hbase-site.xml to spark-submit

Oct 20, 2022

scala apache-spark hbase

Categorize using spark sql

Oct 20, 2022

sql database apache-spark

How to return complex types using spark UDFs

Oct 20, 2022

java json apache-spark user-defined-functions udf

"resolved attribute(s) missing" when performing join on PySpark

Sep 28, 2020

apache-spark pyspark spark-dataframe

Sparse Vector vs Dense Vector

Feb 06, 2020

apache-spark apache-spark-mllib

How to get the schema definition from a dataframe in PySpark?

Sep 24, 2022

apache-spark dataframe pyspark schema azure-databricks
