
New posts in apache-spark

Spark-HBASE Error java.lang.IllegalStateException: unread block data

How to add a typesafe config file which is located on HDFS to spark-submit (cluster-mode)?

Is it possible to run spark yarn cluster from the code?

Persisting data to DynamoDB using Apache Spark

Merge multiple RDD generated in loop

scala apache-spark rdd

Spark not leveraging hdfs partitioning with parquet

Efficiency of flatMap vs map followed by reduce in Spark

How to access an individual element in a tuple on an RDD in PySpark?

Can a model be created in Spark batch and used in Spark Streaming?

How to save RandomForestClassifier Spark model in scala?

How can I declare a Column as a categorical feature in a DataFrame for use in ml

Passing Python functions as objects to Spark

python apache-spark pyspark

How to run spark shell with *local* packages?

maven apache-spark packages

Spark shows different number of cores than what is passed to it using spark-submit

apache-spark

Convert GraphFrames ShortestPath Map into DataFrame rows in PySpark

'Symbol lookup error' with netlib-java

Spark Streaming from Kafka Consumer

Spark explode nested JSON with Array in Scala

Spark: out of memory when broadcasting objects

What type should I declare a DateTime object in a scala class constructor?