apache-spark-sql tutorials

NOT IN implementation of Presto vs. Spark SQL

Oct 26, 2022

Spark SQL - Regex for matching only numbers

Nov 10, 2022

regex dataframe apache-spark pyspark apache-spark-sql

Spark window partition function taking forever to complete

Sep 14, 2022

scala performance dataframe apache-spark apache-spark-sql

How to compare multiple rows?

Nov 06, 2019

scala apache-spark spark-streaming apache-spark-sql

Using groupBy in Spark and getting back to a DataFrame

Nov 02, 2022

scala apache-spark apache-spark-sql

How to get date and time from string?

Dec 06, 2018

scala date apache-spark apache-spark-sql

pyspark expected zero arguments for construction of ClassDict (for pyspark.mllib.linalg.DenseVector)

Dec 09, 2021

apache-spark pyspark apache-spark-sql user-defined-functions apache-spark-mllib

Create Hive external table with schema in Spark

Nov 14, 2021

apache-spark hive apache-spark-sql spark-avro

How to GROUPING SETS as operator/method on Dataset?

Sep 10, 2022

apache-spark dataframe apache-spark-sql

PySpark: Get first Non-null value of each column in dataframe

Nov 03, 2022

python apache-spark dataframe pyspark apache-spark-sql

How to fill None values with a concrete timestamp in DataFrame?

Apr 22, 2022

apache-spark pyspark apache-spark-sql

PySpark - Compare DataFrames

Feb 15, 2022

python dataframe apache-spark pyspark apache-spark-sql

Processing multiple files as independent RDDs in parallel

May 09, 2022

scala apache-spark apache-spark-sql

Joining PySpark DataFrames on nested field

Oct 28, 2022

apache-spark dataframe join pyspark apache-spark-sql

How to ensure partitioning induced by Spark DataFrame join?

Jun 25, 2022

apache-spark dataframe join pyspark apache-spark-sql

Spark write to Postgres is slow

Oct 20, 2022

apache-spark dataframe apache-spark-sql

Peak Execution Memory in Spark

May 18, 2022

apache-spark apache-spark-sql

Find median in Spark SQL for multiple double datatype columns

Oct 15, 2022

apache-spark apache-spark-sql hive-udf

Apache Spark CASE with multiple WHEN clauses on different columns

Jun 02, 2022

apache-spark hadoop apache-spark-sql

How to load a CSV directly into a Spark Dataset?

Oct 23, 2022

scala apache-spark apache-spark-sql
