apache-spark-sql tutorials

Aggregate over column arrays in DataFrame in PySpark?

Dec 20, 2022

Spark: How can DataFrame be Dataset[Row] if DataFrames have a schema?

Dec 20, 2022

scala apache-spark apache-spark-sql apache-spark-dataset

Apply a custom Spark Aggregator on multiple columns (Spark 2.0)

Dec 20, 2022

apache-spark apache-spark-sql aggregate-functions user-defined-functions

How to create UDF from Scala methods (to compute md5)?

Dec 20, 2022

scala apache-spark apache-spark-sql udf

Use "IS IN" between 2 Spark dataframe columns

Dec 20, 2022

apache-spark pyspark apache-spark-sql

Split column of list into multiple columns in the same PySpark dataframe

Dec 20, 2022

pyspark apache-spark-sql

How to interpolate a column within a grouped object in PySpark?

Dec 20, 2022

apache-spark pyspark apache-spark-sql interpolation

Removing non-ASCII and special characters in a PySpark DataFrame column

Dec 20, 2022

python pyspark apache-spark-sql azure-databricks

Spark UDF initialization

Dec 16, 2022

scala apache-spark apache-spark-sql user-defined-functions

Add a column to a Spark DataFrame and calculate a value for it

Dec 17, 2022

apache-spark apache-spark-sql

Spark dataframe is not ordered after sort

Dec 16, 2022

apache-spark apache-spark-sql

MatchError while accessing vector column in Spark 2.0

Dec 17, 2022

scala apache-spark apache-spark-sql apache-spark-mllib apache-spark-ml

How to use CROSS JOIN and CROSS APPLY in Spark SQL

Dec 17, 2022

scala apache-spark apache-spark-sql

TypeError: 'Builder' object is not callable in Spark Structured Streaming

Dec 16, 2022

apache-spark apache-spark-sql spark-structured-streaming

EMR 5.x | Spark on Yarn | Exit code 137 and Java heap space Error

Dec 15, 2022

apache-spark pyspark apache-spark-sql hadoop-yarn

Spark UDAF with ArrayType as bufferSchema performance issues

Dec 16, 2022

scala performance apache-spark apache-spark-sql user-defined-functions

How to extract all elements from array of structs?

Dec 16, 2022

apache-spark pyspark apache-spark-sql

How to check if key exists in spark sql map type

Dec 14, 2022

apache-spark dictionary apache-spark-sql key exists

Spark Dataframe: Select distinct rows

Dec 16, 2022

java sql dataframe apache-spark apache-spark-sql

How to create date from year, month and day in PySpark?

Dec 15, 2022

python apache-spark pyspark apache-spark-sql
