To broadcast a variable so that it occurs exactly once in memory per node on a cluster, one can do:
val myVarBroadcasted = sc.broadcast(myVar)
then retrieve it in RDD transformations like so:
myRdd.map(blar => {
  val myVarRetrieved = myVarBroadcasted.value
  // some code that uses it
}).someAction
But suppose I now wish to perform some more actions with a new broadcast variable. What if I don't have enough heap space because of the old broadcast variables? I want a function like
myVarBroadcasted.remove()
Now I can't seem to find a way of doing this.
Also, a very related question: where do the broadcast variables go? Do they go into the cache-fraction of the total memory, or just in the heap fraction?
There is a way to remove broadcast variables from the memory of all executors. Calling unpersist() on a broadcast variable removes its data from the memory cache of all executors to free up resources.
For your use case, when you want to update your broadcast, you can:
1. Wait for your executors to finish processing the current series of data.
2. Unpersist the broadcast variable.
3. Update the broadcast variable.
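The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: it runs against a local master, the object name BroadcastRefreshSketch and the helper labelAll are made up for the example, and "update" is modelled as creating a fresh broadcast, because an existing broadcast value cannot be modified in place.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.broadcast.Broadcast
import org.apache.spark.rdd.RDD

object BroadcastRefreshSketch {
  // Hypothetical helper: labels each number using the broadcast lookup table.
  def labelAll(rdd: RDD[Int], lookup: Broadcast[Map[Int, String]]): Seq[String] =
    rdd.map(i => lookup.value.getOrElse(i, "?")).collect().toSeq

  def run(): (Seq[String], Seq[String]) = {
    // Assumption: a local master, purely for illustration.
    val conf = new SparkConf().setAppName("broadcast-refresh-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val data = sc.parallelize(Seq(1, 2), numSlices = 2)

      // Step 1: run the action to completion, so no task is still reading v1.
      val v1 = sc.broadcast(Map(1 -> "one"))
      val first = labelAll(data, v1)

      // Step 2: drop the executor-side copies and wait until that is done.
      v1.unpersist(blocking = true)

      // Step 3: "update" by broadcasting a new value under a new handle.
      val v2 = sc.broadcast(Map(1 -> "one", 2 -> "two"))
      val second = labelAll(data, v2)

      (first, second)
    } finally sc.stop()
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Passing the broadcast handle into the helper as a parameter keeps each closure tied to one specific handle, which makes it harder to accidentally keep using the old one after the refresh.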
A broadcast variable is stored on the driver's BlockManager as a single value and separately as chunks (of spark.
Secondly, broadcast variables are immutable, which means that they can't be modified once created. If you need a shared value that tasks can update, accumulators are needed.
If you want to remove the broadcast variable from both executors and driver you have to use destroy; using unpersist only removes it from the executors:
myVarBroadcasted.destroy()
This method is blocking in Spark 1.x and 2.x. Since Spark 3.0, destroy() is non-blocking by default.
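To make the difference concrete, here is a minimal sketch, assuming a local master and a made-up object name (UnpersistVsDestroySketch). It relies on the fact that a broadcast variable stays usable after unpersist (the data is sent again on next use), while any use after destroy fails.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Try

object UnpersistVsDestroySketch {
  def run(): (Seq[Int], Seq[Int], Boolean) = {
    // Assumption: a local master, purely for illustration.
    val conf = new SparkConf().setAppName("unpersist-vs-destroy-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val offsets = sc.broadcast(Seq(10, 20))
      val rdd = sc.parallelize(Seq(1, 2, 3))

      val before = rdd.map(_ + offsets.value.sum).collect().toSeq

      // unpersist only drops executor-side copies; the handle remains valid.
      offsets.unpersist(blocking = true)
      val after = rdd.map(_ + offsets.value.sum).collect().toSeq

      // destroy also releases the driver-side data; the handle is now unusable.
      offsets.destroy()
      val usableAfterDestroy = Try(offsets.value).isSuccess

      (before, after, usableAfterDestroy)
    } finally sc.stop()
  }

  def main(args: Array[String]): Unit = println(run())
}
```

So unpersist fits a value you may need again later, and destroy fits a value you are certain is finished with.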
You are looking for unpersist, available from Spark 1.0.0:
myVarBroadcasted.unpersist(blocking = true)
Broadcast variables are stored as ArrayBuffers of deserialized Java objects or serialized ByteBuffers. (Storage-wise they are treated similar to RDDs - confirmation needed)
The unpersist method removes them from both memory and disk on each executor node. They stay on the driver node, however, so they can be re-broadcast.