Grouping in a simple aggregation storm topology

Tags:

apache-storm

I'm trying to write a topology that does the following:

A spout that subscribes to a twitter feed (based on a keyword)
An aggregation bolt that aggregates a number of tweets (say N) in a collection and sends them to the printer bolt
A simple bolt that prints the collection to the console at once.

In reality I want to do some more processing on the collection.

I tested it locally and it looks like it's working. However, I'm not sure whether I've set the groupings on the bolts correctly, or whether this would work correctly when deployed on an actual Storm cluster. I would appreciate it if someone could review this topology and suggest any corrections, changes or improvements.

Thanks.

This is what my topology looks like.

builder.setSpout("spout", new TwitterFilterSpout("pittsburgh"));
builder.setBolt("sampleaggregate", new SampleAggregatorBolt())
       .shuffleGrouping("spout");
builder.setBolt("printaggreator", new PrinterBolt())
       .shuffleGrouping("sampleaggregate");

Aggregation Bolt

public class SampleAggregatorBolt implements IRichBolt {

    protected OutputCollector collector;
    protected Tuple currentTuple;
    protected Logger log;
    /**
     * Holds the messages in the bolt till you are ready to send them out
     */
    protected List<Status> statusCache;

    @Override
    public void prepare(Map stormConf, TopologyContext context,
                        OutputCollector collector) {
        this.collector = collector;

        log = Logger.getLogger(getClass().getName());
        statusCache = new ArrayList<Status>();
    }

    @Override
    public void execute(Tuple tuple) {
        currentTuple = tuple;

        Status currentStatus = null;
        try {
            currentStatus = (Status) tuple.getValue(0);
        } catch (ClassCastException e) {
            // not a Status: currentStatus stays null and the tuple is skipped below
        }
        if (currentStatus != null) {

            //add it to the status cache
            statusCache.add(currentStatus);
            collector.ack(tuple);


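            //check the size of the status cache and pass it to the next stage if you have enough messages to emit
            if (statusCache.size() > 10) {
                //emit a copy and reset the cache so each batch is sent only once
                collector.emit(new Values(new ArrayList<Status>(statusCache)));
                statusCache.clear();
            }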

        }
    }

    @Override
    public void cleanup() {


    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("tweets"));

    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return null;  // no component-specific configuration
    }


    protected void setupNonSerializableAttributes() {

    }

}

Printer Bolt

public class PrinterBolt extends BaseBasicBolt {

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        System.out.println(tuple.size() + " "  + tuple);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer ofd) {
    }

}

asked Jun 04 '13 18:06

Soumya Simanta

1 Answer

From what I can see, it looks good. The devil's in the details, though. I'm not sure what your aggregator bolt does, but if it makes any assumptions about which values are passed to it, you should consider an appropriate fields grouping. This might not make much of a difference right now, since you're using the default parallelism hint of 1, but if you decide to scale to multiple aggregator bolt instances, any implicit assumptions in your logic may call for a non-shuffle grouping.
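
To illustrate the point about groupings, here is a minimal plain-Java sketch (no Storm dependency) of how tuples might be spread over two aggregator tasks. Everything in it is an assumption made for illustration: the class GroupingSketch, its methods, and the user names are hypothetical, round-robin is only a deterministic stand-in for shuffle grouping, and hash-modulo is a simplified model of how a fields grouping keeps equal keys on the same task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupingSketch {

    // Simplified model of a fields grouping: equal keys always map to the same task.
    public static int fieldsTask(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Deterministic stand-in for shuffle grouping: spread tuples evenly, ignoring the key.
    public static int shuffleTask(int tupleIndex, int numTasks) {
        return tupleIndex % numTasks;
    }

    // Routes each key to a task and returns the keys received by every task.
    public static Map<Integer, List<String>> route(List<String> keys, int numTasks, boolean byField) {
        Map<Integer, List<String>> perTask = new HashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            String key = keys.get(i);
            int task = byField ? fieldsTask(key, numTasks) : shuffleTask(i, numTasks);
            perTask.computeIfAbsent(task, t -> new ArrayList<>()).add(key);
        }
        return perTask;
    }

    // Counts how many distinct tasks received at least one tuple with the given key.
    public static long tasksSeeing(Map<Integer, List<String>> perTask, String key) {
        return perTask.values().stream().filter(list -> list.contains(key)).count();
    }

    public static void main(String[] args) {
        // Hypothetical stream of tweet authors arriving at the aggregator.
        List<String> users = Arrays.asList("alice", "alice", "bob", "alice", "carol", "bob");

        System.out.println("shuffle: alice seen by "
                + tasksSeeing(route(users, 2, false), "alice") + " task(s)");
        System.out.println("fields:  alice seen by "
                + tasksSeeing(route(users, 2, true), "alice") + " task(s)");
    }
}
```

If the aggregator only collects arbitrary batches of N tweets, the shuffle case is fine. If it ever needs all tweets for one key in the same batch, the corresponding change in the topology would be something like fieldsGrouping("spout", new Fields("user")) in place of shuffleGrouping("spout"), where "user" is a hypothetical field name that the spout would have to declare.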

answered Nov 14 '22 01:11

Chris Gerken
