
New posts in apache-spark

Filter based on another RDD in Spark

python scala apache-spark

How to make the first row as header when reading a file in PySpark and converting it to Pandas Dataframe

Exception in thread "main" java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)

SBT assembly jar exclusion

How to specify the path where saveAsTable saves files to?

terminating a spark step in aws

How to reverse ordering for RDD.takeOrdered()?

apache-spark rdd

Aggregate function in spark-sql not found

Python worker failed to connect back

NullPointerException in Scala Spark, appears to be caused by collection type?

scala apache-spark

Spark com.fasterxml.jackson.module error

How to count number of columns in Spark Dataframe?

Upload zip file using --archives option of spark-submit on yarn

Removing empty strings from maps in scala

scala apache-spark

idea sbt java.lang.NoClassDefFoundError: org/apache/spark/SparkConf

scala apache-spark sbt

How to construct Dataframe from a Excel (xls,xlsx) file in Scala Spark?

"Bad substitution" when submitting spark job to yarn-cluster

apache-spark hadoop-yarn

PySpark: when function with multiple outputs [duplicate]

Convert pyspark.sql.dataframe.DataFrame type Dataframe to Dictionary