
Advantage of setting name to RDD

There is an API for assigning a name to an RDD.

For example:

val y = sc.parallelize(List(1,2,3))
y.setName("New Name")

What is the advantage of doing this? Is there a specific reason this feature is available in Spark?

Thanks in advance.

Shashi asked Mar 17 '16 13:03


2 Answers

When you cache the RDD, its name appears on the Storage tab of Spark's Web UI.
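A minimal sketch of this, assuming a local Spark installation on the classpath. The object name `NamedRddExample`, the app name, and the `local[*]` master are illustrative choices, not part of the question. It names the RDD, caches it, runs an action so the cache is actually populated, and then lists the persisted RDDs with their names:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object and app name, chosen only for this illustration.
object NamedRddExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("named-rdd-example")
      .setMaster("local[*]") // assumption: run locally using all cores
    val sc = new SparkContext(conf)
    try {
      // setName returns the same RDD, so it can be chained.
      val y = sc.parallelize(List(1, 2, 3)).setName("New Name")

      // Mark the RDD for caching; nothing is stored until an action runs.
      y.cache()
      y.count()

      // The assigned name is available programmatically as well.
      println(y.name)

      // List every RDD currently marked as persistent, with its id and name.
      sc.getPersistentRDDs.values.foreach { rdd =>
        println(s"id=${rdd.id} name=${rdd.name}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

The Web UI is only served while the application is running (by default on port 4040 of the driver), so to look at the Storage tab it is easiest to run the same statements in `spark-shell` and keep the session open.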

Marius Soutier answered Oct 03 '22 05:10


The RDD source code says this:

/** A friendly name for this RDD */
@transient var name: String = null

/** Assign a name to this RDD */
def setName(_name: String): this.type = {
  name = _name
  this
}

So it is really there as a convenience for the user: it lets the programmer label an RDD with a name of their own choosing. It is always good to have a field that describes or names an object, for identification, security, permitted operations, logging, and so on.
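As a rough illustration of the identification and logging use, here is a small sketch, again assuming a local Spark setup. The object name `RddNameInLogs` and the RDD names are made up for the example. When a name is set, it is included in the RDD's string form and in the lineage printed by `toDebugString`, which makes log output easier to read:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object and RDD names, used only for illustration.
object RddNameInLogs {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("rdd-name-in-logs").setMaster("local[*]"))
    try {
      val numbers = sc.parallelize(List(1, 2, 3)).setName("numbers")
      val doubled = numbers.map(_ * 2).setName("doubled")

      // The string form of a named RDD starts with its name.
      println(doubled)

      // The lineage lists each RDD in the chain, names included.
      println(doubled.toDebugString)
    } finally {
      sc.stop()
    }
  }
}
```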

anquegi answered Oct 03 '22 07:10