Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Storm and Spring 4 integration

I have a prototype storm app that reads a STOMP stream and stores the output on HBase. It works, but is not very flexible and I'm trying to get it set up in a more consistent way with the rest of our apps, but not having much luck figuring out how the current way of working with Storm. We use spring-jms classes, but instead of using them in the standard spring way, they are being created at run time, and setting dependencies manually.

This project: https://github.com/granthenke/storm-spring looked promising, but it hasn't been touched in a couple years and doesn't build properly since the storm jars have been taken into apache incubator and repackaged.

Is there something I'm missing, or is it not worth my while to get these things integrated?

like image 756
liam Avatar asked Jul 03 '14 17:07

liam


2 Answers

@zenbeni has answered this question but I wanna tell you about my implementation, it's hard to make spouts/bolts as spring beans. But to use other spring spring beans inside your spouts/bolts you could declare a global variable & in your execute method check whtether variable is null or not. If it's null you have to get bean from application context. Create a class which contains a method to initialize beans if it's not initialized already. Look ApplicationContextAware interface for more information(Spring bean reuse).

Example Code:

Bolt Class:

public class Class1 implements IRichBolt{
    Class2 class2Object;

    public void prepare() {
        if (class2Object== null) {          
            class2Object= (Class2) Util
                .initializeContext("class2");
        }
    }
}

Util Class for initializing Beans if not initialized already:

public class Util{
    public static Object initializeContext(String beanName) {
        Object bean = null;
        try {
            synchronized (Util.class) {
                if (ApplicationContextUtil.getAppContext() == null) {
                    ApplicationContext appContext = new ClassPathXmlApplicationContext("beans.xml");
                    bean = ApplicationContextUtil.getAppContext().getBean(
                        beanName);
                } else {
                    bean = ApplicationContextUtil.getAppContext().getBean(
                        beanName);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        return bean;
    }
}

Listener for Application Context Change:

@Component
public class ApplicationContextUtil implements ApplicationContextAware {

    private static ApplicationContext appContext;

    public void setApplicationContext(ApplicationContext applicationContext)
        throws BeansException {
        appContext = applicationContext;
    }

    public static ApplicationContext getAppContext() {
        return appContext;
    }
}

Note: Each Worker will initialize spring context because it's running in different JVM.

UPDATE

If you want to use a spring bean class in which you have some values previously assigned try this,

Note: Passing the current class to Bolt's constructor

Class (Topology Creation Class) which is already contains values:

public class StormTopologyClass implements ITopologyBuilder, Serializable {
    public Map<String, String> attributes = new HashMap<String, String>();

    TopologyBuilder builder=new TopologyBuilder();
    builder.setBolt("Class1",new Class1(this));
    builder.createTopology();
}

Bolt by using single argument constructor:

public class Class1 implements IRichBolt{
    StormTopologyClass topology;

    public Class1 (StormTopologyClass topology) {
        this.topology = topology;
    }
}

Now you could use attributes variable & it's values in bolt class.

like image 56
Ajeesh Avatar answered Sep 19 '22 23:09

Ajeesh


In fact, storm-spring seems to be what you are looking for but it is not updated and have limitations (cannot define tasks on bolts / spouts for instance, etc). Maybe you should roll your own integration?

Don't forget your target: a cluster with many workers. How does spring behave when you will deploy your topology with storm api (rebalance for instance) on one more worker? Does it mean it has to instanciate a new Spring context on the worker JVM at startup before Storm deploys the targeted bolts / spouts and defines the executors?

IMHO if you define only Storm components in a Spring configuration it should work (startup configuration for the topology then storm only manages the objects) but if you rely on Spring to manage other components (it seems so with spring-jms), then it could become messy on topology rebalances for instance (singleton per worker / jvm? Or the whole topology?).

It is up to you to decide if it is worth the trouble, my concern with a Spring configuration is that you easily forget the storm topology (it seems it is one JVM but can be many more). Personally I define my own singletons per class-loader (static final for instance or with double check locking if I need deferred instanciation), as it does not hide the (medium-high) complexity.

like image 40
zenbeni Avatar answered Sep 20 '22 23:09

zenbeni