Spark provides an API for assigning a name to an RDD.
For example:
val y = sc.parallelize(List(1,2,3))
y.setName("New Name")
What is the advantage of this? Is there a specific reason for this feature to be available in Spark?
Thanks in advance.
When you cache it, you can see the name in Spark's Web UI on the Storage tab.
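To make that concrete, here is a minimal sketch of naming and caching an RDD, assuming Spark is on the classpath and a local master is acceptable. The object name NamedRddExample, the helper buildNamed, and the app name are made up for illustration. Because setName returns the RDD itself, it can be chained directly with cache().

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object NamedRddExample {
  // Hypothetical helper: builds the RDD from the question, names it, and marks it for caching.
  def buildNamed(sc: SparkContext): RDD[Int] =
    sc.parallelize(List(1, 2, 3)).setName("New Name").cache()

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for this sketch.
    val conf = new SparkConf().setMaster("local[*]").setAppName("named-rdd-example")
    val sc = new SparkContext(conf)
    try {
      val y = buildNamed(sc)
      // cache() is lazy: an action is needed before the RDD is actually stored.
      y.count()
      println(y.name) // prints "New Name"
      // The string form of the RDD should include the name as well.
      println(y.toString)
    } finally {
      sc.stop()
    }
  }
}
```

While the application is still running (for example in spark-shell), the Storage tab should list the cached RDD under "New Name" rather than a generic description, which makes it easier to tell several cached RDDs apart.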
The RDD source code says this:
/** A friendly name for this RDD */
@transient var name: String = null
/** Assign a name to this RDD */
def setName(_name: String): this.type = {
  name = _name
  this
}
So it is really there to help the user: it lets the programmer attach a name of their own choosing to an RDD. It is always good to have a field that describes or names an object, for identification, security, permitted operations, logging, and so on.