New posts in apache-spark-sql

Pyspark how to add row number in dataframe without changing the order?

How to add partitioning to existing Iceberg table

Collect only not null columns of each row to an array

Read data from Kafka and print to console with Spark Structured Streaming in Python

Spark pivot invokes Job even though pivot is not an Action

Which is faster: spark.sql or df.filter("").select("") using Scala?

No applicable constructor/method found for zero actual parameters - Apache Spark Java

Shutdown spark structured streaming gracefully

Convert Column of List to Dataframe

Spark agg to collect a single list for multiple columns

pyspark map type contains duplicate keys

spark apply function to columns in parallel

Spark SQL UNION - ORDER BY column not in SELECT

How to identify columns based on datatype and convert them in pyspark?

Why Spark SQL translates String "null" to Object null for Float/Double types?

What is the most efficient way to select distinct value from a spark dataframe?

How to create Dataset (not DataFrame) without using case class but using StructType?

Use Spark Scala to transform flat data into nested object

Exceeding `spark.driver.maxResultSize` without bringing any data to the driver