apache-spark tutorials and guides

How to read a CSV file with commas within a field using pyspark? [duplicate]

Sep 16, 2025

Connect PySpark to Kafka from Docker container

Sep 16, 2025

docker apache-spark pyspark apache-kafka docker-compose

PySpark Pipeline Error when using Indexer and Encoder

Sep 16, 2025

python apache-spark pyspark pipeline apache-spark-ml

How to install apache-spark 2.3.3 with homebrew on Mac

Sep 15, 2025

apache-spark homebrew

Packaging like a JAR for PySpark

Sep 16, 2025

python apache-spark jar pyspark hadoop-yarn

AnalysisException: It is not allowed to add database prefix

Sep 16, 2025

apache-spark apache-spark-sql

How can I convert a spark dataframe column, containing serialized json, into a dataframe itself?

Sep 14, 2025

json apache-spark pyspark

Spark master won't show running application in UI when I use spark-submit for python script

Sep 16, 2025

apache-spark apache-spark-standalone

How to filter by date range in Spark SQL

Sep 16, 2025

scala apache-spark apache-spark-sql

Setting Environment variables in Spark Cluster Mode

Sep 16, 2025

apache-spark environment-variables hadoop-yarn

Spark scala mocking spark.implicits for unit testing

Sep 15, 2025

scala unit-testing apache-spark mockito implicit

How to create table under a schema in a database

Sep 15, 2025

apache-spark databricks

Can we consume JMS messages from a Topic through Spark Streaming?

Sep 15, 2025

scala apache-spark spark-streaming

Convert Spark Structured Streaming DataFrames to Pandas DataFrame

Sep 15, 2025

python pandas apache-spark pyspark spark-structured-streaming

Is there an issue with the file name openjdk-8-jdk-headless?

Sep 15, 2025

java python-3.x apache-spark ubuntu ubuntu-18.04

Spark making expensive S3 API calls

Sep 14, 2025

apache-spark amazon-s3 databricks

FileNotFoundException (stderr & stdout) when submitting JAR to Spark in EMR environment

Sep 15, 2025

scala apache-spark amazon-s3 jar amazon-emr

How do I improve loading thousands of tiny JSON files into a Spark dataframe?

Sep 14, 2025

json apache-spark

Spark Python Avro Kafka Deserialiser

Sep 15, 2025

python apache-spark apache-kafka avro spark-streaming

Adding a system dependency to Maven

Sep 14, 2025

java maven apache-spark
