New posts in apache-spark-sql

Why does posexplode fail with "AnalysisException: The number of aliases supplied in the AS clause does not match the number of columns..."?

Meaning of Exchange in Spark Stage

How to convert timestamp column to epoch seconds?

Spark DataFrame: Computing row-wise mean (or any aggregate operation)

Spark SQL - Select all AND computed columns?

How do I truncate a PySpark dataframe of timestamp type to the day?

Spark Scala: How to convert DataFrame[vector] to DataFrame[f1: Double, ..., fn: Double]

Remove blank space from data frame column values in Spark

Spark SQL unable to complete writing Parquet data with a large number of shards

How to register Python function as UDF in SparkSQL in Java/Scala?

Spark JDBC fetchsize option

Using pyspark, how do I read multiple JSON documents on a single line in a file into a dataframe?

Is my understanding of parallel operations in Spark correct?

Using a module with udf defined inside freezes pyspark job - explanation?

Is this a bug of spark stream or memory leak?

Spark SQL can use FIRST_VALUE and LAST_VALUE in a GROUP BY aggregation (but it's not standard)
