apache-spark-sql tutorials

Spark SQL PostgreSQL DataFrame partitions

Mar 16, 2023

Unable to merge spark dataframe columns with df.withColumn()

Mar 17, 2023

python apache-spark apache-spark-sql pyspark

Spark SQL/Hive Query Takes Forever With Join

Mar 16, 2023

mysql apache-spark apache-spark-sql

Spark read.json throwing java.io.IOException: Too many bytes before newline

Mar 15, 2023

json apache-spark pyspark apache-spark-sql bigdata

How can I make saveAsTextFile (Spark 1.6) append to an existing file?

Mar 15, 2023

apache-spark spark-streaming apache-spark-sql

Spark job with large text file in gzip format

Mar 14, 2023

hadoop apache-spark amazon-s3 apache-spark-sql parquet

Create a dataframe from a list in pyspark.sql

Mar 14, 2023

python dataframe apache-spark pyspark apache-spark-sql

sql/spark-sql: if statement syntax in a query

Mar 13, 2023

sql join apache-spark-sql

How to save/insert each DStream into a permanent table

Mar 13, 2023

apache-spark pyspark apache-spark-sql spark-streaming

How to implement a trait with a generic case class that creates a dataset in Scala

Mar 13, 2023

scala generics apache-spark-sql traits case-class

Sum the Distance in Apache-Spark dataframes

Mar 12, 2023

scala apache-spark apache-spark-sql graphframes

How to add map column in spark based on other column?

Mar 12, 2023

scala apache-spark-sql

PySpark: Get top k column for each row in dataframe

Mar 11, 2023

python apache-spark dataframe pyspark apache-spark-sql

How to unnest array with keys to join on afterwards?

Mar 11, 2023

apache-spark hive apache-spark-sql hiveql

How to find longest sequence of consecutive dates?

Mar 11, 2023

apache-spark apache-spark-sql

Spark Dataset: Filter if value is contained in other dataset

Mar 09, 2023

java apache-spark apache-spark-sql apache-spark-dataset

Partial/Full-match value in one RDD to values in another RDD

Mar 09, 2023

scala apache-spark apache-spark-sql pattern-matching

Joining Two Datasets with Predicate Pushdown

Mar 10, 2023

scala apache-spark hbase apache-spark-sql apache-phoenix

Spark - sortWithinPartitions over sort

Mar 10, 2023

apache-spark apache-spark-sql cassandra spark-cassandra-connector apache-spark-dataset

PySpark DataFrame: Change cell value based on min/max condition in another column

Mar 07, 2023

python apache-spark dataframe pyspark apache-spark-sql
