apache-spark-sql tutorials

Use more than one collect_list in one query in Spark SQL

Apr 22, 2022

How to convert an RDD of Maps to dataframe

Nov 16, 2022

scala apache-spark apache-spark-sql

Reading Avro File in Spark

Sep 15, 2022

scala apache-spark apache-spark-sql apache-zeppelin

How to add a column to the beginning of the schema?

Sep 15, 2022

scala apache-spark apache-spark-sql

Is querying against a Spark DataFrame based on CSV faster than one based on Parquet?

Apr 02, 2019

apache-spark apache-spark-sql spark-dataframe parquet

Spark SQL: drop Hive table

Jun 02, 2022

apache-spark apache-spark-sql pyspark-sql

Filter dataframe by value NOT present in column of other dataframe [duplicate]

Sep 14, 2022

scala apache-spark apache-spark-sql spark-dataframe

Can't connect to MySQL database from PySpark, getting JDBC error

Oct 10, 2021

mysql python-3.x jdbc apache-spark-sql pyspark-sql

Efficient string suffix detection

Jul 16, 2022

python apache-spark pyspark apache-spark-sql string-matching

How to apply a function to a column of a Spark DataFrame?

Oct 26, 2022

scala apache-spark dataframe apache-spark-sql

Query in Spark SQL inside an array

Dec 05, 2018

apache-spark apache-spark-sql spark-dataframe

message: Hive Schema version 1.2.0 does not match metastore's schema version 2.1.0. Metastore is not upgraded or corrupt

Apr 22, 2021

hive apache-spark-sql

How to add days (as values of a column) to date?

Mar 18, 2022

scala apache-spark apache-spark-sql

partitionBy & overwrite strategy in an Azure DataLake using PySpark in Databricks

Apr 22, 2022

python azure apache-spark apache-spark-sql databricks

String to Date migration from Spark 2.0 to 3.0 fails with "Fail to recognize 'EEE MMM dd HH:mm:ss zzz yyyy' pattern in the DateTimeFormatter"

May 19, 2022

apache-spark pyspark apache-spark-sql

How to read CSV into SparkR version 1.4?

Mar 27, 2022

r csv apache-spark apache-spark-sql sparkr

Outer join Spark dataframe with non-identical join column and then merge join column

Nov 11, 2022

python join apache-spark apache-spark-sql

How to select all columns instead of hard coding each one?

Nov 27, 2019

apache-spark pyspark apache-spark-sql

How to delete rows in a table created from a Spark dataframe?

Sep 15, 2022

apache-spark pyspark apache-spark-sql

How to calculate the max value across several columns per row in PySpark

Aug 10, 2022

python apache-spark pyspark apache-spark-sql
