As far as I know, broadcast is useful for giving each worker a local copy of a variable, and the variable must fit in the worker's memory.
In my case, however, I want a local copy of a large variable that does not fit in the worker's memory.
How can I distribute this large variable without using the broadcast function in Spark?
large variable that does not fit in the worker's memory
As Ram mentioned above, if it doesn't fit in a worker's memory, there is no way to use it as an in-memory local copy, even if you could broadcast it.
If you're trying to do lookups against a large dataset, you can create a connection pool to a database on each worker node. If you have a model, you can save the model to each worker node and read it from file inside foreachPartition. Depending on your use case, there may be other solutions.
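The per-partition lookup idea can be sketched roughly as follows. This is only an illustration under assumptions, not a definitive implementation: it uses Python's built-in `sqlite3` as a stand-in for whatever external database you actually have, and the names `build_lookup_db`, `make_partition_lookup`, and the `lookup` table are hypothetical. The point it shows is opening one connection per partition rather than one per record, so the large dataset stays outside worker memory.

```python
import os
import sqlite3
import tempfile
from typing import Iterable, Iterator, Optional, Tuple


def build_lookup_db(path: str, pairs: Iterable[Tuple[str, str]]) -> None:
    """Create a small on-disk lookup table (stand-in for a real external database)."""
    conn = sqlite3.connect(path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS lookup (k TEXT PRIMARY KEY, v TEXT)")
        conn.executemany("INSERT OR REPLACE INTO lookup VALUES (?, ?)", pairs)
        conn.commit()
    finally:
        conn.close()


def make_partition_lookup(db_path: str):
    """Return a function that processes one partition (an iterator of keys)."""

    def lookup_partition(keys: Iterable[str]) -> Iterator[Tuple[str, Optional[str]]]:
        # One connection per partition, reused for every record in it.
        conn = sqlite3.connect(db_path)
        try:
            for key in keys:
                row = conn.execute(
                    "SELECT v FROM lookup WHERE k = ?", (key,)
                ).fetchone()
                # Keys missing from the table map to None.
                yield key, (row[0] if row else None)
        finally:
            conn.close()

    return lookup_partition


# Local demonstration without Spark: treat a plain iterator as one partition.
db_path = os.path.join(tempfile.mkdtemp(), "lookup.db")
build_lookup_db(db_path, [("a", "apple"), ("b", "banana")])
lookup = make_partition_lookup(db_path)
result = list(lookup(iter(["a", "b", "z"])))
```

With PySpark, a function of this shape could be passed to `rdd.mapPartitions(make_partition_lookup(db_path))`, assuming the database (or file path) is reachable from every worker node. `mapPartitions` is used here instead of `foreachPartition` because the sketch returns the looked-up values; `foreachPartition` fits when the work is purely a side effect.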