New posts in apache-spark-sql

How to fix "ImportError: Pandas >= 0.19.2 must be installed; however, it was not found"?

Can Spark SQL work without a Hive installation?

How to find the median in Apache Spark with Python Dataframe API?

Get all records from the nth bucket in Hive SQL

Spark collect_set vs distinct

HashAggregate in SparkSQL Query Plan

Format string to datetime using Spark SQL

How to apply partial sort on a Spark DataFrame?

Value toDF is not a member of org.apache.spark.rdd.RDD[Any]

Why is Spark to_json() not populating null values?

Create a boolean feature to check if two columns are the same

from_utc_timestamp not taking daylight saving time into account

ERROR Executor: Exception in task 0.0 in stage 6.0 in Spark Scala?

Order of rows shown changes on selection of columns from dependent pyspark dataframe

How to union two dataframes which have same number of columns?

Count distinct values with conditions

How to TRUNCATE and/or use wildcards with Databricks

Using Scala classes as UDF with pyspark

Update using JOIN or CTE in Databricks

Remove last character from string