apache-spark-sql tutorials

Reshape Spark DataFrame from Long to Wide On Large Data Sets

Nov 02, 2022

r scala apache-spark apache-spark-sql

"You need to build Spark before running this program" error when running bin/pyspark

Nov 02, 2022

apache-spark apache-spark-sql pyspark spark-streaming spark-view-engine

How to connect spark-shell to Mesos?

Nov 02, 2022

apache-spark apache-spark-sql mesos mesosphere

Iterating/looping over Spark parquet files in a script results in memory error/build-up (using Spark SQL queries)

Nov 01, 2022

loops apache-spark pyspark apache-spark-sql pyspark-sql

Scala Spark - creating nested json output from simple dataframe

Oct 30, 2022

json apache-spark apache-spark-sql spark-dataframe

How to query on data frame where 1 field of StringType has json value in Spark SQL

Nov 01, 2022

json scala apache-spark apache-spark-sql

Spark ML Pipeline Causes java.lang.Exception: failed to compile ... Code ... grows beyond 64 KB

Nov 01, 2022

python apache-spark pyspark apache-spark-sql pyspark-sql

Transforming one column into multiple ones in a Spark Dataframe

Nov 01, 2022

scala apache-spark dataframe hadoop apache-spark-sql

Why is a join in Spark in local mode so slow?

Oct 31, 2022

apache-spark pyspark apache-spark-sql spark-dataframe

Aggregate sparse vector in PySpark

Oct 31, 2022

apache-spark pyspark apache-spark-sql apache-spark-ml

JSON Struct to Map[String,String] using sqlContext

Oct 31, 2022

apache-spark apache-spark-sql

pyspark corr for each group in DF (more than 5K columns)

Oct 31, 2022

python-3.x apache-spark dataframe pyspark apache-spark-sql

Is there a data architecture for efficient joins in Spark (a la RedShift)?

Oct 31, 2022

apache-spark apache-spark-sql spark-dataframe amazon-redshift

How to use correlation in Spark with Dataframes?

Oct 31, 2022

python apache-spark pyspark apache-spark-sql correlation

How to fix 'DataFrame' object has no attribute 'coalesce'?

Oct 31, 2022

python apache-spark dataframe pyspark apache-spark-sql

Spark Streaming Exception: java.util.NoSuchElementException: None.get

Oct 31, 2022

apache-spark hadoop apache-kafka apache-spark-sql spark-streaming

What are the mandatory options for loading Excel file?

Jun 10, 2021

excel scala apache-spark apache-spark-sql spark-excel

How to compose column name using another column's value for withColumn in Scala Spark

Sep 22, 2022

scala apache-spark apache-spark-sql

Can we load Parquet file into Hive directly?

Jun 26, 2022

hadoop hive apache-spark-sql hiveql parquet

How to avoid shuffles while joining DataFrames on unique keys?

Oct 15, 2022

apache-spark apache-spark-sql
