
New posts in apache-spark

Why is "Cannot call methods on a stopped SparkContext" thrown when connecting to Spark Standalone from Java application?

java apache-spark

Spark: run an external process in parallel

scala apache-spark

Import error during unit test while calling a function from reduceByKey()

Interpreting Spark Stage Output Log

apache-spark task stage

How to access individual predictions in Spark RandomForest?

How can I enumerate rows in groups with Spark/Python?

python apache-spark

How to test Java-Spark using JUnit?

java apache-spark junit4

Spark: differences or conflicts between setMaster in the app conf and the --master flag on spark-submit

Spark ML - Save OneVsRestModel

Does Spark SQL do predicate pushdown on filtered equi-joins?

How to time a transformation in Spark, given lazy execution style?

How to effectively read millions of rows from Cassandra?

Getting emr-ddb-hadoop.jar to connect DynamoDB with EMR Spark

Spark RDD - avoiding shuffle - Does partitioning help to process huge files?

ipython/Jupyter notebook with authentication

PySpark in IPython notebook raises Py4JJavaError when using count() and first()

How to create a custom Encoder in Spark 2.X Datasets?

Property spark.yarn.jars - how to deal with it?

apache-spark

How to split a list to multiple columns in PySpark?

How to convert a column of string type to int in a PySpark DataFrame?