apache-spark-sql tutorials

Save the parquet output file with fixed size in spark

Oct 01, 2022

apache-spark apache-spark-sql

Spark's .count() function is different to the contents of the dataframe when filtering on corrupt record field

Feb 06, 2022

apache-spark pyspark apache-spark-sql

How do I groupby and concat a list in a Dataframe Spark Scala

Nov 08, 2022

scala apache-spark dataframe apache-spark-sql

Spark & Scala: saveAsTextFile() exception

Oct 22, 2022

scala apache-spark hadoop apache-spark-sql bigdata

contains pyspark SQL: TypeError: 'Column' object is not callable

Apr 25, 2022

python apache-spark pyspark apache-spark-sql

How to show my existing column names instead of '_c0', '_c1', '_c2', '_c3', '_c4' in the first row?

Sep 05, 2022

pyspark apache-spark-sql azure-databricks spark-notebook

Spark Parquet read error : java.io.EOFException: Reached the end of stream with XXXXX bytes left to read

Jul 19, 2022

apache-spark apache-spark-sql parquet

Using pyspark, how to expand a column containing a variable map to new columns in a DataFrame while keeping other columns?

Jun 22, 2022

apache-spark pyspark apache-spark-sql

Pyspark filter dataframe if column does not contain string

Nov 03, 2022

python apache-spark pyspark apache-spark-sql

Weird behaviour with spark-submit

Apr 19, 2022

apache-spark hive cloudera-cdh apache-spark-sql

How does Spark DataFrame handle a Pandas DataFrame that is larger than memory

May 28, 2022

pandas apache-spark dataframe apache-spark-sql hdf5

java.lang.UnsupportedOperationException: Writing to a non-empty Cassandra Table is not allowed

Jun 15, 2022

apache-spark cassandra apache-spark-sql spark-streaming datastax-enterprise

How to convert DataFrame columns from string to float/double in PySpark 1.6?

Mar 20, 2021

python pyspark apache-spark-sql type-conversion

How to select constant values from Dataframe coding in Java

Jan 15, 2018

java apache-spark dataframe apache-spark-sql bigdata

How to indicate the database in SparkSQL over Hive in Spark 1.3

Sep 16, 2022

database apache-spark hive apache-spark-sql

pyspark, Compare two rows in dataframe

May 18, 2022

python apache-spark pyspark apache-spark-sql pyspark-sql

How to specify multiple tables in Spark SQL?

Oct 20, 2022

scala apache-spark apache-spark-sql

Spark SQL - JAVA syntax of CASE-THEN?

Apr 15, 2022

java apache-spark apache-spark-sql spark-dataframe

Zeppelin Dynamic Form Drop Down value in SQL

Dec 28, 2021

apache-spark apache-spark-sql apache-zeppelin dynamic-forms

Spark: shuffle operation leading to long GC pause

Sep 05, 2022

scala apache-spark garbage-collection apache-spark-sql g1gc
