
Where does Spark actually persist RDDs on disk?

Tags:

apache-spark

I am calling persist with different storage levels, but I see no difference in performance between MEMORY_ONLY and DISK_ONLY.

I think there might be something wrong with my code. Where can I find the persisted RDDs on disk, so that I can check that they were actually persisted?

Haoliang asked May 05 '15 15:05




1 Answer

As per the doc:

spark.local.dir (by default /tmp)

Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local disk in your system. It can also be a comma-separated list of multiple directories on different disks. NOTE: In Spark 1.0 and later this will be overridden by SPARK_LOCAL_DIRS (Standalone, Mesos) or LOCAL_DIRS (YARN) environment variables set by the cluster manager.
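
To check on a worker node whether anything was actually written, one option is to scan those scratch directories for block files. The sketch below is not part of Spark; the helper names candidate_local_dirs and find_rdd_block_files are made up for illustration. It assumes the on-disk layout I have seen in practice, where the block manager creates a blockmgr-<uuid> directory under the local dir and stores each persisted partition as a file named rdd_<rddId>_<partition>. That layout is an implementation detail and may differ between Spark versions, so treat it as an assumption.

```python
import os
import re

# Assumed naming of persisted RDD partitions on disk: rdd_<rddId>_<partition>.
RDD_BLOCK = re.compile(r"^rdd_(\d+)_(\d+)$")


def candidate_local_dirs(env=None, default="/tmp"):
    """Resolve scratch dirs in the order the doc describes:
    SPARK_LOCAL_DIRS, then LOCAL_DIRS, then the /tmp default.
    A value set via spark.local.dir is not visible here; pass it in yourself."""
    env = os.environ if env is None else env
    raw = env.get("SPARK_LOCAL_DIRS") or env.get("LOCAL_DIRS") or default
    return [d.strip() for d in raw.split(",") if d.strip()]


def find_rdd_block_files(local_dirs):
    """Return sorted (rdd_id, partition, path) tuples for files that look like
    persisted RDD blocks inside any blockmgr-* directory (assumed layout)."""
    found = []
    for root in local_dirs:
        # os.walk silently skips directories it is not allowed to read.
        for dirpath, _dirnames, filenames in os.walk(root):
            parts = dirpath.split(os.sep)
            if not any(p.startswith("blockmgr-") for p in parts):
                continue
            for name in filenames:
                match = RDD_BLOCK.match(name)
                if match:
                    path = os.path.join(dirpath, name)
                    found.append((int(match.group(1)), int(match.group(2)), path))
    return sorted(found)
```

Calling find_rdd_block_files(candidate_local_dirs()) on an executor host should then list one entry per partition persisted with DISK_ONLY. Run it while the application is still alive and after an action has forced the RDD to be computed: persist is lazy, and as far as I know Spark cleans up these scratch directories when the application exits.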

Francois G answered Sep 28 '22 12:09