apache-spark tutorials and guides

Dependency issue with PySpark running on Kubernetes using spark-on-k8s-operator

Sep 20, 2022

How can I inspect per executor/node memory usage metrics of a pyspark job on Dataproc?

Mar 29, 2022

apache-spark google-cloud-platform pyspark hadoop-yarn google-cloud-dataproc

How to pass array column as argument in VectorUdf in .Net Spark?

May 25, 2022

c# apache-spark user-defined-functions apache-arrow .net-spark

How to read gz files in Spark using wholeTextFiles

Nov 05, 2022

hadoop apache-spark gzip

How to submit Apache Spark job to Hadoop YARN on Azure HDInsight

Jun 17, 2022

azure apache-spark azure-hdinsight

Apache Spark network ports configuration

Oct 17, 2022

java tomcat apache-spark

Spark gives NullPointerException during InputSplit for HBase

Aug 12, 2018

scala hadoop mapreduce hbase apache-spark

java.lang.StackOverflowError when using Kryo to serialize objects with references to each other

Jul 05, 2022

java apache-spark kryo kryonet

In Spark Streaming, how to detect an empty batch?

Aug 21, 2022

apache-spark

Spark Streaming Bug - Window of Windowed DStream does not work

Nov 19, 2022

apache-spark spark-streaming

Getting java.lang.IllegalArgumentException: requirement failed when calling Spark's MLlib StreamingKMeans from a Java application

Mar 18, 2020

java apache-spark bigdata hadoop2 spark-streaming

Batch Size in Spark Streaming

Aug 30, 2022

scala twitter apache-spark twitter4j spark-streaming

Partitions not being pruned in simple SparkSQL queries

Sep 13, 2022

amazon-s3 apache-spark apache-spark-sql pyspark parquet

Multiple windows of different durations in Spark Streaming application

May 24, 2018

apache-spark real-time analytics apache-kafka spark-streaming

Failed to load class for data source: com.databricks.spark.csv

Apr 08, 2021

apache-spark

Spark JoinWithCassandraTable on TimeStamp partition key STUCK

Aug 31, 2022

mysql scala cassandra apache-spark datastax-enterprise

Using TestHiveContext/HiveContext in unit tests

Jun 29, 2021

apache-spark hive apache-spark-sql hivecontext

Locally change the log level for the zookeeper C client

Aug 17, 2022

logging apache-spark apache-zookeeper mesos

Spark mapWithState shuffles all data to one node

Nov 06, 2022

scala apache-spark spark-streaming

How to give predicted and label columns in BinaryClassificationMetrics evaluation for Naive Bayes model

Dec 20, 2019

scala apache-spark machine-learning apache-spark-mllib apache-spark-ml
