apache-spark tutorials and guides

Filter rows where a column value is in a list of words (PySpark)

May 26, 2026

python apache-spark pyspark

How to build a graph from a DataFrame? (GraphX)

May 25, 2026

scala apache-spark dataframe graph spark-graphx

kinit: Client's credentials have been revoked while getting initial credentials

May 26, 2026

hadoop apache-spark active-directory kerberos hortonworks-data-platform

Spark cartesian doesn't cause shuffle?

May 26, 2026

apache-spark pyspark rdd concept

Union can only be performed on tables with the compatible column types Spark dataframe

May 26, 2026

scala apache-spark dataframe apache-spark-sql union

Auto increment id in delta table while inserting

May 26, 2026

apache-spark pyspark apache-spark-sql delta-lake

PySpark: How to check whether a file path with a wildcard exists in S3

May 26, 2026

amazon-web-services apache-spark amazon-s3 pyspark

Exploding multiple columns from nested JSON produces extra records

May 24, 2026

json apache-spark pyspark apache-spark-sql nested

Worker could not connect to master (invalid association) on same machine, even though the URL is correct

May 26, 2026

apache-spark

Cassandra query flexibility

May 25, 2026

hadoop cassandra apache-spark bigdata cql

How to make Spark Streaming write its output so that Impala can read it?

May 25, 2026

apache-spark hadoop hive spark-streaming impala

Decimal Type with Precision Equivalent in Spark SQL

May 25, 2026

apache-spark apache-spark-sql

Creating a dictionary from a PySpark DataFrame fails with OutOfMemoryError: Java heap space

May 25, 2026

java python apache-spark pyspark

PySpark Streaming example does not seem to terminate

May 24, 2026

python apache-spark spark-streaming pyspark

Error when broadcasting Joda DateTime in Spark

May 25, 2026

scala apache-spark jodatime

Exception: Java gateway process exited before sending the driver its port number while creating a Spark Session in Python

May 25, 2026

java python python-2.7 apache-spark pyspark

Merge Maps in scala dataframe

May 25, 2026

scala dataframe apache-spark user-defined-functions scala-collections
