I have a situation where I want to execute a system process on each worker within Spark. I want this process to run on each machine exactly once. Specifically, this process starts a daemon which needs to be running before the rest of my program executes. Ideally this should execute before I've read any data in.
I'm on Spark 2.0.2 and using dynamic allocation.
Yes, a worker node can host multiple executors (processes) if it has sufficient CPU, memory, and storage.
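For reference, executor sizing (and therefore how many executors fit on one worker) is driven by Spark configuration. A rough sketch, with purely illustrative values and app name:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Illustrative sizing: with 4 cores and 8g per executor, a worker with
// 16 cores and 32g of RAM could host up to four executors.
val conf = new SparkConf()
  .setAppName("executor-sizing-example")
  .set("spark.executor.cores", "4")
  .set("spark.executor.memory", "8g")
  .set("spark.dynamicAllocation.enabled", "true") // as in the question
  .set("spark.shuffle.service.enabled", "true")   // required when dynamic allocation is on

val spark = SparkSession.builder().config(conf).getOrCreate()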
Go to the Apache Spark installation directory from the command line, type bin/spark-shell, and press Enter; this launches the Spark shell and gives you a Scala prompt to interact with Spark in Scala. If you have added Spark to your PATH, just enter spark-shell in the command line or terminal (Mac users).
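Once you are at the scala> prompt, spark-shell has already created a SparkContext (sc) and a SparkSession (spark) for you, so a quick sanity check might look like this (the commented results are just what you would typically see):

// At the scala> prompt; `sc` and `spark` are pre-created by spark-shell.
sc.version                    // e.g. "2.0.2"
sc.master                     // e.g. "local[*]" or your cluster's master URL
sc.parallelize(1 to 5).sum()  // 15.0, confirms the shell can run a simple job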
Workers (slaves) are running Spark instances where executors live to execute tasks; they are the compute nodes in Spark. A worker receives serialized tasks that it runs in a thread pool, and it hosts a local Block Manager that serves blocks to other workers in the Spark cluster.
You may be able to achieve this with a combination of a lazy val and a Spark broadcast variable. It would be something like below. (I have not compiled the code below, so you may have to change a few things.)
object ProcessManager extends Serializable {
  // Runs at most once per JVM (i.e. per executor) because it is a lazy val.
  lazy val start: Process =
    new ProcessBuilder("/path/to/your/daemon").start() // start your process here
}
You can broadcast this object at the start of your application before you do any transformations.
val pm = sc.broadcast(ProcessManager)
Now, you can access this object inside your transformations just as you would any other broadcast variable and invoke the lazy val.
rdd.mapPartitions { itr =>
  pm.value.start // referencing the lazy val starts the daemon (once per executor)
  // Other stuff here.
  itr
}
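Putting it together, here is a minimal end-to-end sketch of this approach; the daemon path and the object and app names are placeholders of mine, not from the original answer. The idea is to run a cheap job up front purely to force the lazy val on the currently allocated executors before any data is read:

import org.apache.spark.sql.SparkSession

object DaemonStarter extends Serializable {
  // The daemon is launched at most once per executor JVM, the first time
  // any task running in that JVM touches `start`.
  lazy val start: Process =
    new ProcessBuilder("/path/to/your/daemon").start() // placeholder path
}

object Main {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("daemon-per-executor").getOrCreate()
    val sc = spark.sparkContext

    val bc = sc.broadcast(DaemonStarter)

    // Cheap warm-up job: many small partitions so every currently allocated
    // executor gets at least one task and therefore starts the daemon.
    sc.parallelize(1 to 1000, 100).foreachPartition(_ => bc.value.start)

    // ... now read your data and run the real job ...
    spark.stop()
  }
}

One caveat with dynamic allocation: executors that join later only start the daemon the first time one of their tasks touches bc.value.start, so it is worth referencing it inside your real transformations as well.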