New posts in apache-spark

spark <console>:12: error: not found: value sc

Why are aggregate and fold two different APIs in Spark?

Spark can no longer execute jobs. Executors fail to create directory

SparkSQL MissingRequirementError when registering table

How to get Histogram of all columns in a large CSV / RDD[Array[double]] using Apache Spark Scala?

How to control number of parquet files generated when using partitionBy

Numpy and static linking

Difference between Apache spark mllib.linalg vectors and spark.util vectors for machine learning

Spark Exception : Task failed while writing rows

Spark netlib-java BLAS

apache-spark blas netlib

How to make RMSE (root mean square error) smaller when using ALS in Spark?

ALS model - how to generate full_u * v^t * v?

Apache Toree to connect to a remote spark cluster

apache-spark apache-toree

Custom log4j.properties on AWS EMR

apache-spark log4j emr

(python) Spark .textFile(s3://...) access denied 403 with valid credentials

Reading JSON files into Spark Dataset and adding columns from a separate Map

How do I interpret Input size / records in Spark Stage UI

apache-spark

My Spark SQL limit is very slow

Why do I get a “Hive support is required to CREATE Hive TABLE (AS SELECT)” error when creating a table?

scala apache-spark hive

Spark 2.3+ use of parquet.enable.dictionary?

apache-spark parquet